-------------------------------------------------------------------- SAGA Use Case Template: ======================= Name of use case: GRID-TLSE Contact (name and address): Michel Daydé, ENSEEIHT-IRIT, 2 rue Camichel 31071 TOULOUSE CEDEX FRANCE dayde@enseeiht.fr Authors (if different form contact) ................................ 1. General Information: ----------------------- This section consists of check-boxes to provide some context in which to evaluate the use case. 1.1 Which best describes your organisation: Industry [ ] Academic [x] Other [ ] Please specify: ................................... 1.2 Application area: Astronomy [ ] Particle physics [ ] Bio-informatics [ ] Environmental Sc. [ ] Image analysis [ ] Other [x] Please specify: Sparse linear algebra solvers 1.3 Which of the following apply to or best describe this use case Multiple selections are possible, please prioritize with numbers from 1 (low) to 5 (high): Database [ ] Remote steering [ ] Visualization [ ] Security [ ] Resource discovery [ ] Resource scheduling [ ] Workflow [ ] Data movement [ ] High Throughput Computing [ ] High Performance Computing [ ] Other [x] Please specify: Expert site for sparse linear algebra 1.4 Are you an: Application user [ ] Application developer [x] System administrator [ ] Service developer [ ] Computer science researcher [ ] Other [ ] Please specify: ................................... 2. Introduction: ---------------- 2.1 Provide a paragraph introduction to your use case. Background to the project is another alternative. (E.g. 100 words). The goal of the Grid-TLSE project is to design an expert site for sparse matrices. This web site provides tools and software for sparse matrices. It also includes a bibliography on sparse matrix software and collections of sparse matrices. 2.2 Is there a URL with more information about the project ? http://www.enseeiht.fr/lima/tlse 3. Use Case to Motivate Functionality Within a Simple API: ---------------------------------------------------------- Provide a scenario description to explain customers' needs. E.g. "move a file from A to B," "start a job." Please include figures if possible. If your use case requires multiple components of functionality, please provide separate descriptions for each component, bullet points of 50 words per functionality are acceptable. Example of scenario : determination of the best symmetric permutation - Identify all the symmetric permutations available on the TLSE grid for each symmetric permutation do select one serve offering the symmetric permutation send the matrix to the server start calculation recover the results done - Determine the symmetric permutation that provided the minimun fill-in. - Send back the answer to the user 4. Customers: ------------- Describe customers of this use case and their needs. In particular, where and how the use case occurs "in nature" and for whom it occurs. E.g. max 40 words There are two classes of users : - expert users that are deploying sparse linear algebra sofware over our grid and proceed to experiments aiming at comparing, debugging, ... their sofwares. - non-expert users that are mainly interested by uploading their matrices and determine which solver or which machine is the most efficient in terms of memory, execution time,... for their problems. 5. Involved Resources: ---------------------- 5.1 List all the resources needed: e.g. what hardware, data, software might be involved. . Softwares : large number of sparse softwares . A large database that hold description of the solvers, the computers available, informations related to the users, results of experiments and matrix collections. . The TLSE gird will involve a wide range of computers from PC up to large parallel computers so that users may proceed to a wide range of experiments. 5.2 Are these resources geographically distributed? At national level and we may anticipate utilization of computers in foreign countries. 5.3 How many resources are involved in the use case? E.g. how many remote tasks are executing at the same time? Typically we anticipate tens up to hundreds. 5.4 Describe your codes and tools: what sort of license is available, e.g. open or closed source license; what sort of third party tools and libraries do you use, and what is their availablility; do you regularly work from source code, or use pre-compiled applications; what languages are your applications developed in (if relevant), e.g. Fortran, C, C++, Java, Perl, or Python. The sparse linear algebra software is either open source or available in libraries such as Harwell Subroutine Library. The code we deveoped is open source. DIET is also publically available. 5.5 What information sources do you require, e.g. certificate authorities, or registries. The expert site only requires a on-line registration to identify users. 5.6 Do you use any resources other than traditional compute or data resources, e.g. telescopes, microscopes, medical imaging instruments. NO 5.7 How often is your application used on the grid or grid-like systems? [x] Exclusively [ ] Often (say 50-50) [ ] Ocassionally on the grid, but mostly stand-alone [ ] Not at all yet, but the plan is to. 6. Environment: --------------- Provide a description of the environment your scenario runs in, for example the languages used, the tool-sets used, and the user environments (e.g. shell, scripting language, or portal). We use Java, DIET on top of CORBA (OmniORB), XML, HTTP and Java Server Pages for the web interface. We use PostGresQL for the database. The linear algebra solvers are mainly written in Fortran 77 or 90, C or C++. 7. How the resources are selected: ---------------------------------- 7.1 Which resources are selected by users, which are inherent in the application, and which are chosen by system administrators, or by other means? E.g. who is specifying the architecture and memory to run the remote tasks? Users are choosing softwares (e.g. sparse solvers) and may additionally select an architecture (e.g. PC LINUX or SGI,...). Selection of the machine(s) - this is a multi-parametric application - running the remote tasks depends on the availability of the software, the load of the computers, considerations on the availability of data on the machines,.... 7.2 How are the resources selected? E.g. by OS, by CPU power, by memory, don't care, by cost, frequency of availability of information, size of datasets? Resources are selected partly at the expertise level (depending on the user requirements in terms of software, size of problem, type of architecture he wants to experiments,...) and at the middleware level (CPU load, management of data persistency,...). 7.3 Are the resource requirements dynamic or static? The resource requirements are dynamic. Furthermore, an expertise often consists of several steps (each step is launching a bunch of executions over the grid) and executions at a given step depends from the results of previous steps. 8. Security Considerations: --------------------------- 8.1 What things are sensitive in this scenario: executable code, data, computer hardware? I.e. at what level are security measures used to determine access, if any? We rely on the mechanisms that are - or will be available - at the Corba level on which DIET stands. 8.2 Do you have any existing security framework, e.g. Kerberos 5, Unicore, GSI, SSH, smartcards? We currently use some SSH. 8.3 What are your security needs: authentication, authorisation, message protection, data protection, anonymisation, audit trail, or others? Authentification, data protection are certalinly the main issues for our project. 8.4 What are the most important issues which would simplify your security solution? Simple API, simple deployment, integration with commodity technologies. Keep it as simple and as hidden as possible. 9. Scalability: --------------- What are the things which are important to scalability and to what scale - compute resources, data, networks ? For our project : compute resources and data. 10. Performance Considerations: ------------------------------- Explain any relevant performance considerations of the use case. Since we are not providing computational resources GFlop rate is not the main issue. Probably throughput is the main thing of interest. 11. Grid Technologies currently used: ------------------------------------- If you are currently using or developing this scenario, which grid technologies are you using or considering? DIET middleware developed at LIP-ENSL (France) by F. Desprez and his team. 12. What Would You Like an API to Look Like? -------------------------------------------- Suggest some functions and their prototypes which you would like in an API which would support your scenario. Something like Grid_execute('name_of-procedure', parameters ) with both synchronous and asynchronous calls and way of testing end of execution is fine. Additionally something sending back the list of services that are capable of executing a specific operation (giving the name of a procedure - i.e. who can execute GEMM ? - or at a higher level - who is capable of executing a matrix multiplication with symmetric matrices ? -) would be of interest. 13. References: --------------- List references for further reading. Please refer to our website http://www.enseeiht.fr/lima/tlse. --------------------------------------------------------------------