14:00-15:30 PGI Session 1
-------------------------

Project representation:
- UK NGS David Wallom (DW)
- EDGI Etienne Urbah (EU)
- IGE Steve Crouch (SC)
- SAGA Andre Merzky (AM)
- EMI Jon Kiier Nielsen (JKN)
- XSEDE, Genesis II - A Grimshaw (AG), Mike Savaro (MS)
- RENKEI Katzushige Saga (KS)
- UNICORE Daniel Mallmann (DM)
- QCG-Computing Marius Mamonski (MM)

Minutes: Steve Crouch

New actions:
[AG]: will write a description of the differences between these two approaches (new port type for suspend/resume, or achieved through EPR)

Minutes

AG: One big spec or multiple specs/iterations? What should we do?
SC: Not one big spec. 200 reqs in one spec would not be scalable in terms of development and would be very impractical.
AG agrees - just too difficult.
DW: we don't have all the spec experts required in the room to put into this. Not enough representation. We shouldn't do this.
DK: any other representatives that should be here?
AG: 5th F2F PGI meeting - stakeholders have come in and out over time. Some big groups (interested in a big spec) have been represented here quite a bit.
AG: do we start with what we have, or start again? We can leverage our experiences and what is there if we go with what is there.
AM: should we split the work into various areas with various specs?
AG: agreement to use GLUE2 resource properties, replace in JSDL resource properties - Etienne. BES: PGI wanted state model changes (primary thing wanted), vector operations (I see as trivial). Job management - JSDL for description, BES for management. GridFTP, Byte/IO: these play into how the data is managed in JSDL - a secondary concern.
EU: HPC file staging - people interested in how to specify file staging.
AG: we want to do this; do we change JSDL for it or not? Also the SPMD, ParamSweep profile.
EU: we should reuse where possible, where no problems exist with doing so.
AG: big difference between profiling and extending JSDL and doing something new (new namespaces, whole doc, etc).
Do tweaks and profiles where possible.
AG: look at using existing specs, and building on these, rather than restarting - a consensus?
DW: EGI virtualisation TF is all about profiling existing specs.
Alan E Sill (AES): the security group fully intends to take what works and what exists, working fast using this approach.
[Consensus agreed on doing this]
AG: PGI in Catania - we can agree on func bits forever, but security has to work too. Don't drop security on the floor.
JJ: need to find right people to engage with the security ???? group, to develop in areas that overlap with PGI.
AES: PGI needs to state clearly what needs to be done to hand off to other groups for the work.
AG: take existing PGI work as inputs to the process. Authn/z/delegation pass to security group. We have 2 profiles for delegation, one for each style of delegation: message-level delegation (SAML assertions) and transport layer. Problems in PGI - not much compromise possible with security, people wouldn't change transport level and vice versa (hence two profiles).
AES: all authz implementations use SAML or XACML.
AG: not always - e.g. Genesis II uses ACLs.
AES/JJ: profile work to decide on SAML assertion for VO-using sites whose s/w internally does its own thing.
AG: transport of auth mechanism already profiled in WS-I BSP. Semantics need to be profiled. Orthogonal for Access Control policy; that's completely site dependent.
DW: security not original focus, what's the desired session outcome?
AG: decide on moving forward on modifying existing specs - have consensus here. Continue along the JSDL/BES path. Or reject, and go another way. If we use BES - 2 approaches. 1) change state machine to meet requirement to hold jobs at various data-related stages; 2) history about BES - some historic bad decisions, should we revisit those decisions? You get EPR back, subsequent management done through factory port type - some thought one service. Re-examining this would mean changing this.
EU: three sequence diagrams about the endpoint for job management - is there a bottleneck on only one endpoint? BES issue.
DM: interested in param sweeping. Don't drop it.
DW: writing profiles.
AG: agree on work items for group.
DW: we should have experiences doc on interop work done so far?
AG: already written - SC.
AG:
- data staging FSE+
- param sweep
- state model extensions (getstatus)
- vector ops
- JSDL changes
EU: many of these profiles already exist - should we build on them or make something different?
AG: motion to change JSDL to reflect JSDL group changes in resource profiling (through GLUE2).
DW: we need something coherent and whole, would like to see something that fits all things together, e.g. GLUE fits with BES fits with JSDL. With this top-level doc referencing other profile docs.
DM: some work needs to be done to move towards this position first.
DW: agreed.
AES: e.g. CDMI interesting as a spec from this standpoint.
DW: Virtualisation session - everything broken up into smaller parts. We keep revisiting this, but need to approach this sensibly so it all comes together again at the top.
MM: extension related to BES used in production(?) - how to deal with this?
EU/AG: PGI use cases and requirements analysis done last year - painful.
EU: moving forward, pick one of the above listed items.
AG: JSDL changes need to be done, really want to do data staging. BES without data staging is more or less useless. You need to understand protocols.
DM: can we bring up faults in JSDL data staging?
AG: yes, perhaps in next session?
AG: need new working group chairs decided in Salt Lake City - Michel Drescher, Andrew Grimshaw.
AG: what problems are people having with BES? We changed it to support RNS (small change).
MM: change - added things to factory attributes. Manual data staging, other changes.
AG: have manual staging as a separate interface. Might want activity status achieved through one resource, but that forces one service to take all requests.
AG: leap into JSDL stuff (not good, JSDL have their own ideas), or leap into BES stuff - e.g. status requests through a single endpoint.
AM: should go through requirements made to PGI.
AG: for BES - manual data staging. Current state model says you can't make new states, only substates. PGI req from some members stated that a job could be submitted, but held in a blocked state until an explicit 'go' tells it to move into the next state. Support through new port type.
AM: add a new port type - don't break it badly. Some clients wouldn't know about suspend/resume substates.
EU: full suspend/resume state model is complicated.
AG: define new port type which defines what happens for each substate in BES. Circulate this idea. Have this as a compliance target in one of the uber-high level profiles.
DM: mentioned already in OGSA-DMI - would be good for us.
EU: existing state model of BES good enough to be profiled. But we need a valid state change from Pending to Failed.
AG: one other thing - to get to PGI use case, have it going from Pending to Running:Suspended initially. We would need to make a few more changes for this. JSDL might not like it, but we can put it in a JSDL element to state 'suspended'. But not a clean way, since JSDL should only be a job description, not management.
AG: motion - we will use this basic state model, and either ask JSDL to include a 'start in suspended state' element, or add a new port type that controls the states a job is in.
DM: yes, because being able to suspend and resume at off-peak times would be useful for OGSA-DMI.
AG: ok, we define a new port type with suspend and resume. It is resolved! It will have 'suspend' with an EPR (or a vector of EPRs), and similarly 'resume' with an EPR (or vector of EPRs).
EU: this relates back to the recent PGI requirements doc I wrote for these requirements.
AM: people will complain hold and suspend mean the same thing.
AG: suspend is at BES level, hold gets put in JSDL.
EU: rename hold as suspend?
DW: it'll do whatever LRMS supports at back end.
Which support hold?
EU/AG: if it can't do suspend, it faults.
AG: could put suspend/resume on the activity itself. Instead of a new port type on BES, make the call through the EPR.

Action [AG]: will write a description of the differences between these two approaches (new port type for suspend/resume, or achieved through EPR)

AG: feels like we've made good progress today!
DW: those who are not here should have limited voice.
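
Illustrative sketch (not part of the session record): the port type option discussed above could look roughly like the Python model below. It assumes the transitions mentioned in the discussion (Pending to Failed, Pending to Running:Suspended), operations that accept one EPR or a vector of EPRs, and a fault when the back end cannot suspend. All names here (State, Activity, SuspendResumePort, SuspendFault, lrms_can_suspend) and the per-EPR result map are hypothetical and are not taken from BES, JSDL or any PGI document.

```python
from enum import Enum


class State(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    SUSPENDED = "Running:Suspended"  # a substate of Running, not a new top-level state
    FINISHED = "Finished"
    FAILED = "Failed"


# Transitions mentioned in the discussion (illustrative only, not from any spec).
ALLOWED = {
    State.PENDING: {State.RUNNING, State.SUSPENDED, State.FAILED},
    State.RUNNING: {State.SUSPENDED, State.FINISHED, State.FAILED},
    State.SUSPENDED: {State.RUNNING, State.FAILED},
}


class SuspendFault(Exception):
    """Raised when an activity cannot be moved to the requested state."""


class Activity:
    def __init__(self, epr, lrms_can_suspend=True):
        self.epr = epr
        self.lrms_can_suspend = lrms_can_suspend  # assumption: a back-end capability flag
        self.state = State.PENDING

    def move_to(self, target):
        if target is State.SUSPENDED and not self.lrms_can_suspend:
            raise SuspendFault(f"{self.epr}: back end does not support suspend")
        if target not in ALLOWED.get(self.state, set()):
            raise SuspendFault(
                f"{self.epr}: {self.state.value} -> {target.value} not allowed"
            )
        self.state = target


class SuspendResumePort:
    """Hypothetical port type: each operation takes one EPR or a vector of EPRs."""

    def __init__(self, activities):
        self._by_epr = {a.epr: a for a in activities}

    def _apply(self, eprs, target):
        if isinstance(eprs, str):
            eprs = [eprs]  # treat a single EPR as a vector of one
        results = {}
        for epr in eprs:
            try:
                self._by_epr[epr].move_to(target)
                results[epr] = target.value
            except (KeyError, SuspendFault) as exc:
                results[epr] = f"fault: {exc}"  # per-EPR fault, others still processed
        return results

    def suspend(self, eprs):
        return self._apply(eprs, State.SUSPENDED)

    def resume(self, eprs):
        return self._apply(eprs, State.RUNNING)


if __name__ == "__main__":
    port = SuspendResumePort(
        [Activity("epr-1"), Activity("epr-2", lrms_can_suspend=False)]
    )
    print(port.suspend(["epr-1", "epr-2"]))
    print(port.resume("epr-1"))
```

The alternative raised at the end of the session (calling suspend/resume on the activity through its own EPR) would correspond to invoking move_to on each Activity directly rather than going through a shared port object.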