GGF14 DAIS Session 1 ==================== Notes: Mario Antonioletti Norman Paton opens the session. This (first DAIS) session will cover: Review of progress see https://forge.gridforum.org/projects/dais-wg/document/GGF14-Session1-Intro/en/1 Review of the WS-DAI specification see https://forge.gridforum.org/projects/dais-wg/document/GGF14-Session1-WS-DAI/en/1 The second DAIS session will cover: WS-DAIR https://forge.gridforum.org/projects/dais-wg/document/GGF14-Session2-WS-DAIR/en/1 WS-DAIX https://forge.gridforum.org/projects/dais-wg/document/GGF14-Session2-WS-DAIX/en/1 WS-DAIO https://forge.gridforum.org/projects/dais-wg/document/GGF14-Session2-WS-DAIO/en/1 Group has been going for quite some time. BOF at GGF4. Hope to submit core specs in July 2005 (or well before GGF15). Submitted specs for GGF14 rather late as we were trying to get the documents as close to final submission as possible. Some of the changes in WS-DAI have not quite propagated to the realization specs. Have taken long as we have trying to work out our scope and dependencies. Started using OGSI. Scarred by the OGSI experience. Not all DAIS participants/vendors are committed to WSRF. Also, WSRF is not suitable for all data resources nor do they map naturally. Have spent a considerable amount of time trying to explore the mapping options. Now have a core that does not have a dependency on WSRF. This has a fair amount of functionality but not soft state management and other things that would be gained through the use of WSRF. Hence have introduced a layer over a core set of functionalities that do not have a dependency on WSRF. Susan Malaika now takes over to present the present status of the WS-DAI. (content is mainly as those in the slides presented) Dave Berry: My memory of the GGF13 DAIS session where the conduit was introduced was that the DBMS was the conduit and the tables and other things were managed by DAIS. Dave Pearson: the DBMS was acting as a conduit... Susan: that was the idea in Seoul but it has evolved since then. We are trying to take into account which objects are managed by DAIS and which are not. Also trying to create a framework that works with and without WSRF. Dave Pearson: in your slides you said that the data conduit is optional and that it allows you to access WSRF and non-WSRF service. Are there any limitations in not using the conduit? Susan: one of them is the way that you deal with properties - without WSRF you have one operation to get the whole property document you do not have a finer granularity of access ... Bruce Barkstrom: this appears to be written in the context of DBMSs - what happens with data in very large collections that have very large files. What do you do with that? Also, in your data catalogue do you allow for the inclusion of remote catalogues - for instance external libraries and references to other catalogues? Susan: first dealing with large resource - you can define an interface that allows you to get pieces of your data back at a time ... our intent that this is flexible. Bruce: Earth sciences calls that a subset within a file and you would not use a query to extract this information but a function. You also have supersetting - where you collect pieces of data together and put them together and send them back, e.g. in meteorology you could extract pictures of hurricanes from images kept in different files and stored in different locations. Do you support this? Jon Myers: I am from the DFDL group - have been looking at the XML realization - if you have a DFDL description that maps over multiple files you could use this to provide this kind of behavior ... Susan: so what you are saying is that you can describe the whole file system as though it were a single document - a view - and this view mechanism might provide a way of looking at many files in different locations in a file system to present a global view - is that right? Jon Myers: yes Susan: for the catalogue part of the question ... you asked if we could deal with a distributed catalogue - the catalogue could be a real catalogue or a distributed catalogue ... we only describe the interface ... have a catalogue interface what data you have access to Bruce: it is not the distributed part but the granularity of the description that concerns me - in the previous example the hurricanes are not in the catalogue - these were determined by a data mining application that found these hurricanes and created a new catalogue containing this information ... Susan: you need to look at the specifications to see if it satisfies your requirements... Norman: I would have answered the question differently - the word catalogue is misleading.... nothing to stop people from defining registries our catalogue only defines the resources that the services have. It could be generalized but I would not use it that way ... Susan: so maybe catalogue is not such a good name ... it only has a list of names that are accessible through the service Savas: when you presented the conduit you suggested that it is optional - is the design of the conduit optional or is it in the implementation technology that is optional? Susan: I would say it is optional in the architecture ... it is there to have lifetimes managed and finer grain access to properties .. Savas: to introduce an artifact in the technology only to add some capabilities does not seem productive to me .... Susan: it makes it possible for people that do not want to use WSRF to use DAIS ... ... Savas: I understand the design decision ... I would prefer the conduit to be part of the architecture. If you use WSRF then every message is directed through the conduit but in the case where you do not have WSRF then the name becomes the conduit. Susan: the name is always there .. Savas: if it is not the name that identifies the conduit then what is it? ... it is the operations that you get through WSRF - you have a WSRF way of manipulating things... ... Savas: there is a 1-1 association between a conduit and WSRF? Susan: yes ... Norman: a conduit is a WS-Resource ... we don't like the name conduit Savas: I thought the conduit was introduce to allow different implementations to be used but it now only seems to be there if you want to use WSRF ..... Not objecting to the use of WSRF ... I'm making the point that from a design point of view you are introduce a concept to only use WSRF when you can use WSRF without using the conduit. Norman: the question that Savas is asking is mainly a presentational one as opposed to a disagreement in the content of what is being said ... ... Savas: if you were to tell me that every message had to go through a conduit to identify the resource then I would have understood what you meant .... ... Norman: if you use WSRF, you have an EPR that identifies the resource, which then gives you the option as to whether you include an abstract name in the messages or not. It is the case that the type of the messages is the same whether you use the conduit or not ... if you pass a name to something that is uniquely identified by WSRF then passing that name is unnecessary but still possible. You could mandate that you always pass a name even in WSRF which means .... the client code used will look the same for both cases .... so we could require a name in every case ... ... Susan: we do not mandate the names in WSRF operations only in the DAIS operations ... ... Paul Watson: wish to applaud the effort of the group - what you have come up with is pragmatic and workable. The next thing is best practice - people will want to use this ... they may over use some of the features - for instance the dynamic way of creating data resource. I see why this is useful but I feel it could be badly used - sky query have a similar framework - use select into which allows you to keep it in the data resource and that offers large advantages .... Allows you to address Bruce's question but if someone reads the spec they might think too much about creating data resources ... the suggestion would be to come up with a best practice document. Greg Riccardi: if one said create a data resource - a sensible is to do a create table from select ... the data resource does not have to be created it can just be a create view ... the implementation could do this ... Dave Pearson: the spec is providing generic capabilities .... For instance the JSDL people could say that you could do this by submitting a JSDL document ... the best practice guide would be a good idea ... but until the implementations come out it's difficult to state what best practice is ... Dave Berry: maybe there is a research project ... Bruce: is it worth talking about scenarios in this context ...? [ Here is scenario supplied by Bruce for DAIS: https://forge.gridforum.org/projects/dais-wg/document/SubsettingandSupersetting-UseCaseforDAIS/en/1 ] ...