Notes from the DAIS Working Group Session at OGF22
==================================================

February 27th, 2008. Cambridge, Massachusetts.

Status Update (Mario Antonioletti)
----------------------------------

Dave Pearson resigned at the last OGF. Mario Antonioletti and Isao Kojima are the new chairs.

Mario summarised the history of the group. It was noted that the rechartering process to include the RDF work had not completed. Mario will raise this with the Data Area Directors. It was noted that a Data AD was not present in the room.

Mario summarised the current status of implementations of the DAI specifications:

  DAIX in OGSA-DAI and at Ohio State University
  DAIR in OGSA-DAI, GReLC, and AMGA

OSU implementation of DAIX (Amy Krause)
---------------------------------------

Amy Krause presented some slides on behalf of the group at Ohio State University, who attempted to implement the DAIX specification using their Introduce tool to create services which plugged into their Mobius database. Issues included a problem with the way that Introduce expected the datatypes to be defined, and the fact that faults do not extend WS-BaseFaults. However, the main issue was lack of available effort.

A discussion ensued around where the datatype definitions are placed, and it was agreed that Mario would get in touch with OSU to determine why they did not just try altering the WSDL files to see if it worked.

A WS-DAI implementation with OGSA-DAI (Elias Theocharopoulos)
-------------------------------------------------------------

Elias presented the work done on a DAIX implementation. He summarised the WS-DAI operations supported and contrasted the OGSA-DAI and WS-DAI architectures.

He noted the memory overhead and JDBC row retrieval issues. Are these major flaws in the spec which prevent efficient implementations? Mario asked whether you could count the number of results without retrieving them (execute SELECT COUNT(*) FROM ...
on a database) to get the number of rows in the result set. But that would mean running the query, which could be expensive. Would it make sense to make some of the elements (e.g. result size) optional rather than mandatory to allow better scalability? Are these issues related to the specification, or are they limitations in JDBC, e.g. can you open multiple JDBC connections with the same state? Access to multiple result sets from the same prepared statement is not allowed when using JDBC. It was agreed that this needs to be clarified and a resolution agreed.

There was a discussion of how potential changes would be fed back: via experience documents, via errata or via a spec revision.

There was a discussion of where the NotAuthorizedFault is thrown (it may be raised before the actual operation is invoked, depending on how the container implements security) and whether the fault may be obsolete. This should be discussed to see whether the fault needs to become optional, and a tracker should be raised.

Everyone appears to violate the opacity of the reference parameters of a WS-EPR.

Elias noted that there were ambiguities in the SubcollectionName URI over whether it should be absolute or relative. We need to see how others would choose to implement this. There was a discussion of whether things should be kept consistent, or whether they should reflect the expected behaviour.

Elias noted that there was an issue when removing subcollections with non-existing parent collections: should the parent tree be created automatically, or should a fault be thrown? This is not clear from the spec. Mario felt that a fault should be thrown if intermediate collections do not exist, as this is more likely to be a sensible default. Better not to have implicit creation.

On RemoveSubcollection faults, there isn't enough scope in the InvalidCollectionFault to indicate which collection caused the fault.
Is it clear enough in use for the client, given that you can't express which collection didn't exist? Mario gave a counter-example where you would not want to be told that something had been "deleted" when in fact it hadn't, e.g. if sensitive information was there. There was a disagreement between "spec purists", who felt there was enough information in the fault, and developers, who felt that in practice you might have to constantly repeat operations to identify what went wrong. It was agreed that the fault structure should be discussed on the mailing list and on the telcon.

On the AddDocument operation there was ambiguity in one of the responses. The missing parameter in the invalid collection fault does not allow more information about the fault to be provided. There was also an issue on overwriting documents which is not clear in the spec: what does the flag mean? The group agreed that the first, "the document WAS overwritten", is the correct response.

Elias went over a number of issues related to the use of OGSA-DAI.

RDF(S) Querying (Isao Kojima + Mirza Said)
------------------------------------------

Isao and Said summarised the work that had been done on RDF(S) querying. It was noted that the use cases could be used as the basis of use cases to be provided to SAGA.

Mario asked whether the OGSA Glossary editor (Jem Treadwell) had been kept informed of the glossary work by UPM and AIST. This is important now that the OGSA Data effort has reduced. This will be published in the motivational document for the RDF work. Mario suggested that they should also contribute to OGSA. Mario was confused about what is meant when things are considered synonyms. This will be clarified offline.

It was noted that there is a bigger issue with the adoption of WS-Naming and the use of EPIs. This needs to be clarified with the OGSA group.

RDF(S) Ontology (Miguel Esteban Gutierrez)
------------------------------------------

Miguel summarised the work that had been done on the ontology.
Mario had an issue with UPM coming up with a set of write operations on their own (the read capabilities are provided by SPARQL). Mario asked whether, for every RDF data resource, you have to map your API to the underlying way of updating the RDF store. Mario noted the confusion arising from the use of the term "native interfaces" on the slides, where it actually means an abstraction over native interfaces (e.g. APIs for Jena and Sesame). Perhaps "primitive interfaces" would be better.

A discussion arose about why, when this work on profiles seems very good, there weren't more people at the session. One issue is that the proposal was not mature enough when it was proposed to the semantic web community. The main added value is that we are providing service interfaces to RDF and also an abstract API to many RDF implementations. Mario felt that this latter work could be very important to the community. It really needs better socialisation by the group. Could these operations be implemented as an API, rather than as web service operations?

The group agreed that there was a lot still to be discussed!

--------------------------------------------------------------

Notes taker: Neil P Chue Hong