This is a static archive of the previous Open Grid Forum Redmine content management system saved from host redmine.ogf.org file /dmsf_files/7393?download=11587 at Fri, 04 Nov 2022 18:38:29 GMT OGSA, WS-RF and CEA

OGSA, WS-RF and the Common Execution Architecture

Guy Rixon (gtr@ast.cam.ac.uk); 2007-09-19

These notes describe some ways that virtual-observatory software might use an OGSA platform and the associated requirements for that platform. AstroGrid's Common Execution Architecture is examined as a possible point of linkage between IVOA and OGSA grids.

CEA as a computational grid

The Common Execution Architecture (CEA) is AstroGrid's pattern for driving web services in workflows. CEA has these characteristics:

A system based on CEA is a computational grid. It supplies astronomical processing via a standard interface. A CEA service is a specialized form of the job-manager pattern. It is less general than the job managers in generic grid-toolkits, e.g. Globus, because the jobs that can be run are constrained by the service provider; users cannot supply arbitrary programmes. The applications available on a CEA service can be made into grid commodities by standardizing their interfaces and the descriptions of those interfaces. However, the raw processing-power supporting the service is not available as a commodity.

In principle, a CEA service can be adapted as a facade for a local batch-queuing system, such as Condor or GridEngine. This has not yet been done; current CEA services run their applications either as in-process calls to Java classes or by forking a local sub-process for each call.

CEA also supports MySpace, AstroGrid's implementation of a data grid. Inputs and outputs of applications can be defined to be files in virtual storage; CEA uses temporary, local copies of these files to hide the data grid from the application code. CEA will support VOSpace, the IVOA standard for data grids, when that standard comes into use.
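The staging of temporary, local copies can be illustrated with a minimal Python sketch. It assumes a hypothetical DataGrid interface with fetch and store operations; these names, the InMemoryGrid stand-in, the run_with_staging helper and the "grid://" locations are illustrative inventions, not part of CEA, MySpace or VOSpace.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, Protocol


class DataGrid(Protocol):
    """Hypothetical minimal data-grid interface (not a real MySpace/VOSpace API)."""

    def fetch(self, location: str, local_path: Path) -> None: ...

    def store(self, local_path: Path, location: str) -> None: ...


class InMemoryGrid:
    """Stand-in data grid that keeps file contents in a dictionary."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def fetch(self, location: str, local_path: Path) -> None:
        local_path.write_bytes(self.files[location])

    def store(self, local_path: Path, location: str) -> None:
        self.files[location] = local_path.read_bytes()


def run_with_staging(grid: DataGrid, app: Callable[[Path, Path], None],
                     input_location: str, output_location: str) -> None:
    """Copy the input to a temporary local file, run the application on
    local paths only, then copy the result back to the data grid."""
    with tempfile.TemporaryDirectory() as tmp:
        local_in = Path(tmp) / "input"
        local_out = Path(tmp) / "output"
        grid.fetch(input_location, local_in)
        app(local_in, local_out)  # the application never sees the grid
        grid.store(local_out, output_location)


def to_upper(src: Path, dst: Path) -> None:
    """Toy application: upper-case a text file."""
    dst.write_text(src.read_text().upper())


grid = InMemoryGrid()
grid.files["grid://home/user/catalogue.txt"] = b"m31 m33"
run_with_staging(grid, to_upper,
                 "grid://home/user/catalogue.txt",
                 "grid://home/user/result.txt")
```

The application function receives only local paths, so the same code could in principle run against a different data grid by swapping the object passed as grid.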

CEA is integrated with the registry of computing resources as defined by IVOA. The descriptions of applications used to make calls to CEA services are also used in the registration records of those services.

AstroGrid's approach differs from that of most other virtual-observatory projects. The majority of virtual-observatory services so far published are synchronous (the work completes or fails during a single HTTP call to the service) and have WSDL contracts that are specialized to what they do. CEA avoids synchronous processing in order to support long-running operations. It avoids application-specific contracts in order to simplify the interface to the workflow engine.

It is worth noting that the application-description language in CEA was originally intended to be a small extension to WSDL: this would have maintained the application-specific contracts for the services. It turned out to be easier to use standard WSDL, with a common contract for all CEA services, and to make the extensions as schemata used in the "types" section of the contract.

IVOA, CEA and WS-ResourceFramework

The IVOA recognizes the need for asynchronous processing, even though the existing services mostly work synchronously. IVOA has a draft standard for using WS-ResourceFramework to manage an asynchronous job-step. This standard addresses the initiation, management and monitoring of job-steps, with particular emphasis on notification of completion of the work.

CEA pre-dates WS-RF and the IVOA standard for its use; it makes no use of WS-RF and has its own libraries and interfaces for managing asynchronous work. The CEA semantics for managing job steps are essentially identical to those of WS-RF. If IVOA approves the use of WS-RF (as opposed to alternatives based on WS-CommonApplicationFramework or WS-Coordination), then it makes sense to refactor CEA to use WS-RF also.
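The asynchronous pattern discussed here (initiate a job-step, monitor it, and be notified on completion) can be sketched in a few lines of Python. This is an in-process illustration only: the JobStepManager class, its method names and the phase names are assumptions made for the example and do not reproduce the WS-RF, IVOA or CEA interfaces.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from enum import Enum
from typing import Callable, Dict
from uuid import uuid4


class Phase(Enum):
    """Illustrative job-step phases (names are assumptions)."""
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"


class JobStepManager:
    """Toy asynchronous job manager: submit returns at once with an
    identifier; the work runs later on a worker thread."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._phases: Dict[str, Phase] = {}
        self._lock = threading.Lock()

    def submit(self, work: Callable[[], object],
               on_done: Callable[[str, Phase], None]) -> str:
        job_id = str(uuid4())
        self._set(job_id, Phase.PENDING)
        self._pool.submit(self._run, job_id, work, on_done)
        return job_id

    def phase(self, job_id: str) -> Phase:
        """Monitoring: report the current phase of a job-step."""
        with self._lock:
            return self._phases[job_id]

    def shutdown(self) -> None:
        """Block until all submitted job-steps have finished."""
        self._pool.shutdown(wait=True)

    def _set(self, job_id: str, phase: Phase) -> None:
        with self._lock:
            self._phases[job_id] = phase

    def _run(self, job_id: str, work: Callable[[], object],
             on_done: Callable[[str, Phase], None]) -> None:
        self._set(job_id, Phase.RUNNING)
        try:
            work()
            final = Phase.COMPLETED
        except Exception:
            final = Phase.ERROR
        self._set(job_id, final)
        on_done(job_id, final)  # notification of completion


manager = JobStepManager()
notifications: Dict[str, Phase] = {}


def listener(job_id: str, phase: Phase) -> None:
    notifications[job_id] = phase


ok_id = manager.submit(lambda: sum(range(1000)), listener)
bad_id = manager.submit(lambda: 1 / 0, listener)
manager.shutdown()
```

In a web-service setting the identifier would be replaced by whatever handle the chosen framework supplies, and the listener by a remote notification; the sketch only shows the division between submission, monitoring and notification.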

Thus, we expect to see a new generation of astronomy web-services that are "grid services" by virtue of using WS-RF; i.e. they use the same "plumbing" as standard services specified by GGF. Some of these new services will be synchronous; some will be asynchronous. Some will have the CEA-standard interface; some will have application-specific interfaces. None will necessarily comply with OGSA.

IVOA and OGSA

Virtual observatory software could converge with OGSA. This could mean either or both of two things:

IVOA has standards (many still in development) for the following areas:

IVOA does not have standards in job-based computing, either for job-manager services or for workflow systems. However, virtual-observatory projects that are partners in IVOA have internal standards for these; CEA is AstroGrid's example. It is likely that IVOA will eventually standardize one or more of these systems.

From the list above, which service could plausibly implement OGSA interfaces?

The mood in IVOA is that it is not especially helpful for us to implement our services as part of the OGSA platform. The general preference is to make the virtual observatory a client of the OGSA platform such that the observatory may make use of generic grids. More specifically, we aim to make our software a client of complete OGSA platforms supplied by other developers. Because of this bias, most participants in IVOA do not have detailed lists of requirements for OGSA; rather, they assume that an OGSA platform can be driven from the virtual observatory via some gateway that will be written in the astronomical community. In this mode we have some general requirements:

It is important to note that the virtual observatory is itself a grid. Rather than interfacing to OGSA at the service level, it may be more useful to drive job-management systems via DRMAA from within virtual-observatory services.

It is possible that some operations and schemata for service metadata are common to many services in the OGSA platform. If this is the case, then, presumably, there will be utilities that drive these common features: e.g. metadata browsers, service up-time monitors, utilities to manage WS-Resources. In this case, we would like to be able to use these utilities on virtual-observatory services. Therefore, we urge GGF to make these parts of OGSA an overt standard that will be supported by platform implementors.

It is also conceivable to build an adaptor such that the entire virtual-observatory platform can be driven as if it were an OGSA platform. This would be valuable if there arose workflow systems for OGSA comparable to the Triana system: i.e. complete systems for defining a scientific computation that call into OGSA to collect, process and store the data. It is rather harder to build this kind of adaptor than to build the client-side adaptor for the virtual observatory to call OGSA. It is particularly difficult to make the astronomy-specific parts of the virtual observatory look like the OGSA platform. To do so requires job-oriented services inside the virtual observatory and this brings us back to AstroGrid's CEA.

CEA and OGSA

It is conceivable that CEA might be redesigned such that CEA services follow the OGSA specification for a job-manager service. It is a feasible change because the CEA-service definition is semantically close to the expected equivalent in OGSA. The cost of the change might be partly offset by combining it with the change to use WS-RF. However, AstroGrid has no current plans to make this change.

Defining the CEA service to match OGSA lets CEA clients call other OGSA job-managers and lets workflow systems call CEA by treating it as an OGSA clone. The specializations that make CEA good for the virtual observatory would appear in three places only:

Notably, the specialization to astronomy must not appear in the middleware that places job steps on job-managers and manages the orchestration of job-steps into workflows. It is only worth converging CEA and OGSA if this separation is maintained; otherwise, the two systems cannot use each other's parts.

For CEA to adapt easily to OGSA, CEA needs OGSA to meet the general requirements listed in the previous section plus some special requirements on the specification for the job-manager service.

OGSA-standard job-description language
There must be a job-description language such that each job and job-step can be defined completely. The part of the grid executing the job should never need to refer back to the end user in order to schedule and execute the job (this covers batch jobs and excludes jobs steered interactively via real-time connections). This removes the need for callbacks from OGSA to the virtual observatory.
Workflow descriptions
The job description in the current CEA describes a workflow as a job broken into a number of job-steps. A job-step is an atomic operation that can be submitted to a job-manager service. The steps are arranged in a pattern of sequential and parallel operations. We need to keep this feature if we change CEA to match OGSA. Therefore, we require the OGSA job-submission language to be a workflow language.
Extensible job-description language
Each job description must contain fragments that set up the job-steps. These fragments are application specific. In the current CEA, they use XML vocabularies and we need to retain this feature. Therefore, the job language in OGSA must have extension points where we can add our own schemata.
Pluggable data grid
The current CEA interfaces directly to the astronomy data grid; OGSA will not, since the astronomy data grid will not change to follow OGSA standards. We need the data grid to support data-intensive computing. Where the OGSA job-manager interface is implemented by CEA, we can add the connection to the astronomy data grid as an extra feature; the locations to be read and written in the data grid are then specified in the application-specific part of the job description. However, this feature would not work with pure-OGSA clients. Therefore, there may be scope for an API for pluggable data-grids with very basic features: essentially, copying whole files between data-grid locations and the local file-system as seen by the business logic. Some research is needed to see whether this pluggability is really feasible.
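
The requirements on workflow descriptions and extensibility can be made concrete with a small Python sketch. It models a job as a tree of sequential and parallel groups whose leaves are job-steps, each carrying an application-specific parameter fragment. All class names, the execute function and the application names are hypothetical; a plain dictionary stands in for the XML fragment, and the sketch is neither the CEA job description nor any OGSA language.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple, Union


@dataclass
class Step:
    """An atomic job-step: an application name plus an application-specific
    parameter fragment (a dict standing in for an XML fragment)."""
    application: str
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class Sequence:
    """Children run one after another."""
    children: List["Node"]


@dataclass
class Parallel:
    """Children may run concurrently; the group ends when all have ended."""
    children: List["Node"]


Node = Union[Step, Sequence, Parallel]


def execute(node: Node, submit: Callable[[Step], None]) -> None:
    """Walk the workflow tree, handing each job-step to submit.
    The walker knows nothing about the contents of the parameters."""
    if isinstance(node, Step):
        submit(node)
    elif isinstance(node, Sequence):
        for child in node.children:
            execute(child, submit)
    elif isinstance(node, Parallel):
        workers = max(1, len(node.children))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(execute, child, submit)
                       for child in node.children]
            for future in futures:
                future.result()  # re-raise any failure from a branch
    else:
        raise TypeError(f"unknown workflow node: {node!r}")


workflow = Sequence([
    Step("query-catalogue", {"region": "M31"}),
    Parallel([
        Step("extract-sources", {"band": "J"}),
        Step("extract-sources", {"band": "K"}),
    ]),
    Step("cross-match"),
])

log: List[Tuple[str, Optional[str]]] = []
log_lock = threading.Lock()


def record(step: Step) -> None:
    """Stand-in for submission to a job-manager service."""
    with log_lock:
        log.append((step.application, step.parameters.get("band")))


execute(workflow, record)
```

The point of the design is the separation noted above: the code that orders and dispatches the steps never inspects the parameters, so astronomy-specific vocabulary stays in the fragments and in the applications that read them.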

Caveat

We have little input so far from the working groups inside IVOA. None of the group leaders expressed an urgent need to link the virtual observatory with GGF-standard grids. Some of the group leaders noted a lack of introductory material about GGF standards: it is hard to understand how to use GGF standards and the full implications of adopting them.

Thus, this document reflects the views of the author rather than IVOA policy. I have tried to capture the mood and probable direction of IVOA. The detailed suggestions for OGSA and CEA are my own ideas, informed by discussions with colleagues in IVOA.

