GFS-WG #2 ARCH Tuesday, Oct 4, 2005 2pm to 3:30pm Note taker: Ann Chervenak (ISI) Agenda ------ 1. Distributed Storage Tank Experience (Manuel Pereira, IBM Almaden) 2. Service Oriented Architecture (Manuel Pereira, IBM Almaden) 3. GFS Architecture: diagram, functionality and scenarios (Osamu Tatebe, AIST) Meeting ------- 1. Distributed Storage Tank Experience (Grid SAN File System) o Overview picture of distributed storage tank o Global namespace: need a unified hierarchical namespace for distributed and heterogeneous sources - Enables file system grid - Has become the RNS - Will be moving into naming group o VFDS: File-system specific rendering of RNS is also being defined o Examples of mapping tables - Allow general capability of using various resolvers for names o GNS rendering examples - NFSv2/3: using automounts - NFSv4, CIFS: dynamic mounting using referrals; client talks directly to NFS, which has a link to RNS namespace service - Other clients (SAN and other file systems) can go through a proxy cache/protocol gateway; communicates with RNS directly and can mount diverse data source servers o Reference implementation of RNS is heading toward open source - Goal is to distribute it in Globus Toolkit - Two clients: API with command line and GUI Question: does this look like a UNIX interface? Two sets of operations: namespace operations and storage operations At the VFS level, can operate on both levels If done right, should be able to distinguish at the Grid application level between these two types of operations Should be able to say at the Grid level: create a file, but a pointer to it is already in the namespace Another e.g.: what about a move: are you changing the pointer in the namespace or the data or both? Separating functionality of namespace management and storage raises consistency issues Problem already exists with legacy storage, with multiple access mechanisms: hard to guarantee consistency May have a layer in new systems that does validation/verification and determine changes Have considered a separate consistency service Question: Storage resource management services exist; leave namespace management to another component What is scope of specification? Does it include storage management or just namespace management? To be answered later... 2. Service Oriented Architecture o Service-oriented architecture discussion and definition - Self-contained services, do not depend on state of other services - Allows consumer to discover associated services under the service accessed Question: Does this imply that the GFS is coordinating access to lots of other services, including GSM? Or is it just a namespace management service? Two parts of this group's charter: a. namespace management from RNS b. grid file system architecture So, RNS is a service that is leveraged in a Grid file system architecture RNS could be factored out Hierarchical naming as one part Resolving would be a separate part, perhaps consolidate in a single service with WS-Naming, OREP services for resolution Want to coordinate with OGSA-D and other working groups GFS: two possibilities 1. specification as a profile; GFS specification rendered as a profile for implementation; client that wants to participated in a GFS, follows the spec document, implements the spec and knows how to talk to underlying services; requires more effort to develop a client; updates to underlying services would be more difficult 2. implement a GFS service to which clients communicate; not a gateway service; initially talk to GFS (similar to access to a DNS server), and that server coordinates contact with other services; question is whether GFS service would actually be lightweight or not; would like to leverage a transaction management service to hold request state Second approach is complicated because it has to leverage existing services that are moving targets Comment: could have it both ways, could do a reference implementation of a profile; may not need to use the coordinator in that space Specification language requires "MUST" statements, which can be problematic; hard to say that the client must do X Coordinator service might just repeat all the service interfaces of the lower-level services 3. Grid file system architecture o Identify necessary services and interactions among them o Discussion of file system functionality (file content read/write access, locking, manipulation, attributes, directory operations, file system mounting, replica management) Comment: Can be dangerous to superimpose locking on legacy systems because there may be other access mechanisms Comment: At Grid level, move a file is not necessarily a local operation. Comment: Copy a file can be a complex operation Comment: Some of these operations affect the storage system not the namespace (e.g., read/write access operations) Some both: file locking, file manipulation, etc. o Scenarios for : - Reading a file (read-only) - Updating a file (read-write) E.g., talk to VFDS to get pathname to file locations; select among file locations; access to read or update the file; VFDS: update or invalidate metadata if needed Comment: ambitious, deals with file updates - File locking Need to add a namespace locking mechanism in VFDS or provide separate locking service (for either namespace or storage system locking) - Create a file Request storage allocation; access (create a file) and VFDS to register pathname - Remove file VFDS to find file locations, remove path names; storage access to remove all files if needed - Move a file VFDS moves the pathname in the namespace ONLY - Access control Owner/group/other, ACLs Security level mismatch among services: how to enforce access control?? - File attributes - Directory operations - File replica management Discovery or selection; use VFDS to find file locations; select among locations; file copy operations; remove source in migration case; validate new copy; add new file location in VFDS o Need to map all this functionality to specific services