LSG-RG session
GGF8 Seattle, Thursday 4:00-5:30 PM
Attendance: 31
Chairs: Abbas Farazdel, Dave Angulo
Minutes prepared by: Dave Angulo

* Read IPR
* Summarized session from yesterday
* Rick Stevens from ANL presented the rest of the material.
* Reference Architecture & Requirements
* Model for Community Involvement: MPEG
* It takes 3-4 years for MPEG to converge. This timeline is rather long for us, but it's important to have broad buy-in from the community. They have a formal Request for Information and Request for Proposals.
* Our survey and our list of bio-grid projects are a start on the requirements. How do we get from this list and the group to a standard specification?
* The core system needs to support core databases. An object model is needed. It needs to have language independence. It must have high-performance interfaces and peer-to-peer synchronization.
* Principal partners and stakeholders that need to be invited: (1) Biology and biomedical community (2) Computer science community (3) Industry - both users and technology providers (4) Agencies - NIH, NSF, DOE, etc. (5) Standards organizations (6) Professional societies - e.g. Society for Computational Biology. We need to put together a database of these stakeholders.
* Start with the LSG survey. This will give some scoping and requirements. Identify the stakeholders. Write an RFI to gather requirements. In 3-4 meetings, take the requirements and write an RFP to address them. Give groups 90 days to respond with proposals. Put together a core team to evaluate the proposals. Hold a series of meetings and combine the proposals that actually meet the requirements into a document describing an open architecture. There should be more than one reference implementation of the architecture.
* The standard should be database independent and language independent. It should support a flexible data sharing model so that a group could combine public data with proprietary data and publish only the public data.
* Scalability, security, and portability are important.
* Scalability goals: scale to millions of genes and gene products, 100,000s of genomes, millions of phenomes, thousands of cooperating sites.
* OLSG Services: directory services, namespace-ontology services, brokering, channel services, computing services
* OLSG Services sit on top of OGSA services.
* Question from Abbas Farazdel: Does this include data federation?
* Response from Rick: Synchronization engines can update data or links to data, or the kernel can have data modules.
* Mohammad: Open Bio Grid already has this. Many open bio groups have a huge set of requirements.
* Ray: This will have a positive impact on the life sciences group, and on groups outside of life sciences as well. Avaki and RSP have pieces of this technology. Rick responded that we don't want to take outside implementations that are sub-optimal for life sciences.
* Karpjou: We need data standards. Rick: This is separate from data standards. Use existing ontologies.
* Ray Hookway: Are we just talking about sequence data? Rick: There are 300+ databases. We want this to work as a design point for any kind of data: genome data, phenome data, molecular structure data. Ray: There are so many different data types that it's difficult to combine them.
* Who will be leading this? Rick is happy to get it going, but would like to recruit a leader. Rick isn't convinced that the LSG has the credibility to get enough participation from the entire community. Rick wants to pursue this through LSG until people are convinced whether or not it is doable.
* Mohammad: We need buy-in from the biologists who do this work day in and day out. His feeling is that the people here are mostly computer scientists. Rick: He has lots of biologists at ANL and the University of Chicago who are behind this already. We have to get enough of the community engaged that the result has the respectability of the community.
  There are two dozen sites where most of the work is being done. We need to get at least half of those engaged.
* Rob: One of the important standards groups in life sciences is I3C.
* General consensus for regular conference calls (bi-weekly). The problem is timing, because participation is worldwide. One comment: hold two calls per month, one timed for Asian time zones and the other for European time zones.
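
Editorial illustration (not part of the session): the flexible data sharing model raised in the presentation, where a group combines public and proprietary data but publishes only the public part, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not a proposed design: the `Record` type, its `public` flag, and the `combine` and `publishable` helpers are all hypothetical names introduced here for illustration, and the sample records are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical data record; the `public` flag is an assumption of this sketch."""
    record_id: str
    payload: str
    public: bool


def combine(*sources):
    """Merge records from several sources into one local working set keyed by id.

    If two sources share an id, the later source wins.
    """
    merged = {}
    for source in sources:
        for record in source:
            merged[record.record_id] = record
    return merged


def publishable(merged):
    """Return only the records marked public, sorted by id."""
    ordered = sorted(merged.values(), key=lambda r: r.record_id)
    return [r for r in ordered if r.public]


# Made-up example data: two public records and one proprietary record.
public_db = [Record("gene:1", "ATGC", True), Record("gene:2", "GGCA", True)]
private_db = [Record("assay:7", "in-house result", False)]

working_set = combine(public_db, private_db)  # used locally, includes proprietary data
published = publishable(working_set)          # what the group would share with other sites
```

The point of the sketch is only that the combined working set and the published set are distinct: a site works against everything it holds, while the sharing step filters on visibility before anything leaves the site.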