OGSA Teleconference - 17 August 2006 ==================================== * Participants Jay Unger (IBM) Jem Treadwell (HP) Ellen Stokes (IBM) Andreas Savva (Fujitsu) Darren Pulsipher (EMC) Allen Luniewski (IBM) Chris Jordan (SDSC) Donal Fellows (Uom) Michel Drescher (Fujitsu) Stephen Davey (NESC) Dave Berry (NESC) Minutes: Andreas Savva * Summary of Actions AI-0817a: Dave Berry and Jay Unger will approach groups (GIN, Globus, etc) that have implemented practical grids and start a discussion on how they handle data: - how data is treated as a resource that can be scheduled - how transfers are modeled - how files are advertised - how applications find files (query, by knowledge,?) AI-0817b: Stephen Davey will do a cut down parameter sweep scenario focusing on the JSDL+BES+Data Transfer/Staging service AI-0817c: Andreas will talk with Hiro about getting a 1-2 hour slot on Thursday. It is important to know early next week---by the Monday (8/21) call. * Aug 10 minutes - not available * Aug 14 minutes - approved with no changes * July F2F minutes - Roadmap - approved with no changes * Review of Actions AI-0803a: Hiro to look into an option for a second EMS/Data call on Sep 7. Tentative; needs confirmation with likely participants. - Agreed to schedule this for Aug.31, preferably in the second part of the call. - Data will provide updates (if any) from their F2F at the end of August. AI-0803c: Ellen to book the IBM facility on Thursday afternoon - Done. - Andreas and Hiro to revise the EMS scenario document (i.e. examples) so that proposal 2 is fulfilled + Andreas will merge Hiro's updates into the next draft - Done - Steve McGough & Andreas to do a mapping from the current JSDL elements / terminology into the new information model structure (Proposed split into Description and Requirements) (from July 18th F2F, postpone until Aug. 17) - By GGF18 as original discussed. * OGSA Information Model and Data Ellen gave an overview of the Information Model work. - Slide 6 shows a middle abstract model. But it might be the case that different abstract model might be necessary for different purposes. One example in Data is where the information for discovery may be different than that for scheduling. - Implementation could be a (set of) XML document(s) or a database. And XQuery is supported by DBs as alternative to ODBC etc - Slide 11 shows a symmetric model. Expressing and matching between resource requirements and job capabilities may involve policy---there are many standards that may be applicable but the group has not worked through them yet. - For Data there are a number of open questions around the idea of different models for discovery vs scheduling. What are the interesting properties for Data scheduling? What information is needed for modeling and at what level of abstraction? - Agreed that there needs to be an effort to tackle the scheduling issue between EMS and Data - Data has so far mostly focused on access rather than scheduling. For example, issues with data availability on the network, or performance characteristics are not modeled - What are the necessary abstractions for these? (fast data, available, ...?) - There are a number of groups that should be consulted to get a better idea of current practice (GIN,Globus,etc). The impression is that the state of the art is - a metadata catalog---what is available - a replica catalog---abstract filename to physical location - choice of what to use is up to application AI-0817a: Dave Berry and Jay Unger will approach groups (GIN, Globus, etc) that have implemented practical grids and start a discussion on how they handle data: - how data is treated as a resource that can be scheduled - how transfers are modeled - how files are advertised - how applications find files (query, by knowledge,?) - The Data group is also thinking about a file metadata definition (something like the information retrievable from a POSIX stat()) - CIM has existing work on files and Ellen can help out * Common Data/EMS Scenarios Data has a set of scenarios describing services that interact with EMS. For example, the parameter sweep scenario. EMS also has some expectations, for example that a data staging description from a JSDL document will be correctly processed by some data transfer service---even though that service may not be described further. So it would be useful to have a set of common scenarions describing basic common activities, making sure that the overlap is checked and the interaction between Data and EMS services is done consistently. Agreed that a good first candidate scenario is simple job execution with data staging. In particular the combination of JSDL, BES, Data transfer service (DMI based?). AI-0817b: Stephen Davey will do a cut down parameter sweep scenario focusing on the JSDL+BES+Data Transfer/Staging service The aim is to discuss it at least within Data at the Data F2F (Aug 28-30 @Edinburgh) and with EMS at GGF18. Since many Data members will be leaving on Thursday it would be good to arrange some time early on Thursday (OGSA F2F). - Confirmed that a number of people from EMS on the call (Donal, Andreas) will be at the F2F. AI-0817c: Andreas will talk with Hiro about getting a 1-2 hour slot on Thursday. It is important to know early next week---by the Monday (8/21) call.