<br><font size=2 face="sans-serif"><b>Open Grid Forum: Data Format Description
Language Working Group</b></font>
<br>
<br><font size=2 face="sans-serif"><b>Weekly Working Group Conference Call</b></font>
<br><font size=2 face="sans-serif"><b>17:00 GMT, 09 Jan 2008</b></font>
<br>
<br>
<br><font size=2 face="sans-serif"><b>Attendees</b></font>
<br><font size=2 face="sans-serif">Mike Beckerle (Oco)</font>
<br><font size=2 face="sans-serif">Geoff Judd (IBM)</font>
<br><font size=2 face="sans-serif">Simon Parker (PolarLake)</font>
<br><font size=2 face="sans-serif">Ian Parkinson (IBM)</font>
<br><font size=2 face="sans-serif">Alan Powell (IBM)</font>
<br><font size=2 face="sans-serif">Steve Hanson (IBM)</font>
<br><font size=2 face="sans-serif">Suman Kalia (IBM)</font>
<br>
<br><font size=2 face="sans-serif"><b>Agenda</b></font>
<br><font size=2 face="sans-serif">1. OGF 22 in Cambridge, MA</font>
<br><font size=2 face="sans-serif">2. Level set on specification drafts</font>
<br><font size=2 face="sans-serif">3. Expression Language</font>
<br><font size=2 face="sans-serif">4. Nulls and defaults - can we drop
useNullForDefault?</font>
<br><font size=2 face="sans-serif">5. Other business</font>
<br>
<br><font size=2 face="sans-serif"><b>1. OGF22</b></font>
<br><font size=2 face="sans-serif">The next OGF conference will be held
February 25-29 in Cambridge, MA. As he is local, Mike is planning to attend
to represent DFDL. The working group should decide what, if anything, we would
like to present at the conference. Mike will enquire about the closing date
for submissions, which may be Jan 11th.</font>
<br>
<br><font size=2 face="sans-serif"><b>2. Specification Drafts</b></font>
<br><font size=2 face="sans-serif">Mike circulated draft 30 of the DFDL
specification before Christmas, and had prepared a plan covering the contents
of the next three drafts. The objective of the plan is to guide the group
to the stage where the specification is no longer a limiting factor on progress
and implementations can proceed with a reasonable expectation that
the specification will not change significantly. Steve mentioned that
IBM are attempting to assign remaining work items internally, and wanted
to coordinate this with the other working group members to avoid duplication
of effort.</font>
<br>
<br><font size=2 face="sans-serif">Due to the demands of his new role,
Mike will need to pass some items that he had been hoping to tackle on
to other people. He suggested that editorship of the specification should
pass around the group with each draft, ideally to whoever would be making
the most significant changes in that draft.</font>
<br>
<br><font size=2 face="sans-serif">For the next draft, number 31, Steve
suggested that Alan might be an appropriate editor as he is working on
the expression language, which is a key subject for the next draft. Simon
would also like to own a draft and would consider taking this one, but could
not commit in the meeting.</font>
<br>
<br><font size=2 face="sans-serif"><b>3. Expression Language</b></font>
<br><font size=2 face="sans-serif">The group has previously discussed difficulties
with forward/backward references in expressions. Mike observed that forward-referencing
expressions can occur in a DFDL schema but could only be used during unparse.
Discussing whether it is feasible to police this statically, Mike reckoned
that while it may be difficult to analyze an expression to see whether
it referred forward or not, this would probably be a decidable problem
(e.g., by following the dfdl:outputValue chain). </font>
<br>
<br><font size=2 face="sans-serif">Steve asked how we should specify the
data type to be returned from an expression, there being two candidates:</font>
<br><font size=2 face="sans-serif">a) the XML Schema type of the DFDL property</font>
<br><font size=2 face="sans-serif">b) the 'resolved' data type of the DFDL
property as needed by the parser</font>
<br><font size=2 face="sans-serif">Take dfdl:length as an example. The
XML Schema type is 'string' because the field can accept numeric literals,
expressions, regular expressions, etc. But the parser will always want an
integer. </font>
<br><font size=2 face="sans-serif">Agreed that an expression should return
the 'resolved' data type.</font>
<br>
<br><font size=2 face="sans-serif">Steve asked whether, in the property
descriptions, we should include the allowable return type from an expression.
Mike believed that we should, as it may be distinct from the DFDL type
for that field.</font>
<br><font size=2 face="sans-serif">So, the dfdl:length property description
in the spec needs to say exactly what the options are - e.g., "a literal
integer, or an expression that resolves to an integer, or a regular expression
that resolves to an integer".</font>
<br>
<br><font size=2 face="sans-serif">Using the XSD "maxOccurs"
field as an example, which is normally an integer but may also be the token
'unbounded', Simon suggested that simply using the 'resolved' type may
not be sufficient and that a processor will need to be aware that, in some
cases, the result of an expression may not be the natural type. Mike concluded
that we would need to specify both types as above and also any 'distinguished
tokens'.</font>
<br>
<br><font size=2 face="sans-serif">Finally, should a DFDL engine automatically
cast an expression result to the 'resolved' type, or instead strictly enforce
the return type of the expression? The group felt the latter option to
be preferable. </font>
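<br>
<br><font size=2 face="sans-serif">A minimal Python sketch of the strict
option, for illustration only. The function name check_expression_result and
its parameters are hypothetical and do not come from any DFDL draft. It assumes
that a property declares one 'resolved' type plus an optional set of
'distinguished tokens', as discussed above for maxOccurs, and that no automatic
cast is attempted.</font>
<pre>
```python
def check_expression_result(value, resolved_type, distinguished_tokens=()):
    """Return value unchanged if it is acceptable for the property, else raise.

    Hypothetical helper: strict enforcement means no automatic cast is tried.
    """
    # A distinguished token (such as 'unbounded') is accepted as-is.
    if isinstance(value, str) and value in distinguished_tokens:
        return value
    # bool is a subclass of int in Python, so reject it explicitly for int.
    if resolved_type is int and isinstance(value, bool):
        raise TypeError("expected int, got bool")
    if not isinstance(value, resolved_type):
        raise TypeError(
            "expected {}, got {}".format(resolved_type.__name__, type(value).__name__)
        )
    return value


# A length-like property: resolved type is integer, no distinguished tokens.
print(check_expression_result(12, int))
# A maxOccurs-like property: integer or the token 'unbounded'.
print(check_expression_result("unbounded", int, {"unbounded"}))
# Strict enforcement: the string "12" is rejected rather than cast to 12.
try:
    check_expression_result("12", int)
except TypeError as err:
    print("rejected:", err)
```
</pre>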
<br>
<br><font size=2 face="sans-serif">(Alan Powell joined the meeting)</font>
<br>
<br><font size=2 face="sans-serif"><b>4. Nulls and defaults</b></font>
<br><font size=2 face="sans-serif">Steve would like to review his previous
correspondence with Mike before discussing this further. It will be included
in the agenda for next week's meeting.</font>
<br>
<br><font size=2 face="sans-serif"><b>5. Property Precedence</b></font>
<br><font size=2 face="sans-serif">Geoff and Steve have been preparing
a proposal for precedence using a mind map. Steve will distribute this
initial proposal for wider review.</font>
<br>
<br><font size=2 face="sans-serif">(Mike Beckerle and Suman Kalia left
the meeting)</font>
<br>
<br><font size=2 face="sans-serif"><b>6. Entity references</b></font>
<br><font size=2 face="sans-serif">Alan has been looking at the use of
XML entity references to more easily allow non-printable characters to
be written into DFDL documents, and has distributed a proposal within IBM.
There are some issues around this at the moment (need DTD to define entities,
allowable characters in XML 1.0 docs). Alan is looking at these.</font>
<br>
<br><font size=2 face="sans-serif">This discussion in IBM had led to the
concept of a mechanism to easily represent arbitrary whitespace, which
is a common feature of text formats but which causes problems when modelling.
Simon has experience with this concept and will send Steve a description
of how PolarLake handle this.</font>
<br>
<br><font size=2 face="sans-serif">Steve suggested we could handle this
by allowing delimiters to be a <u>list</u> of allowable values, with the
first used as a default on unparse. (We already have this idea for dfdl:nullValue).
Simon observed that this could not handle arbitrary length whitespace.
Steve said that we should have entities that cover that - like &lt;WSP&gt;
and &lt;OWSP&gt; in IBM's WTX parser (the O meaning optional) - these are
extremely useful. So then you could say things like (ignore incorrect
entity syntax):</font>
<br>
<br><font size=2 face="sans-serif"> dfdl:separator ="x0Dx0A
x0D" </font>
<br>
<br><font size=2 face="sans-serif">meaning allow the separator to default
to CRLF but allow CR on its own.</font>
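<br>
<br><font size=2 face="sans-serif">The list-of-values idea can be sketched
roughly as follows in Python. This is an illustration under stated assumptions,
not proposed DFDL behaviour: the function names are hypothetical, the two
separator values are those written in the example above (x0Dx0A, then x0D), and
trying the longest candidate first on parse is an assumption of the sketch that
the group did not discuss.</font>
<pre>
```python
def match_separator(data, pos, separators):
    """Return the separator found at data[pos:], or None if none matches.

    Longest candidates are tried first so that the two-character value
    is not mistaken for the single-character one (sketch assumption).
    """
    for sep in sorted(separators, key=len, reverse=True):
        if data.startswith(sep, pos):
            return sep
    return None


def default_separator(separators):
    """On unparse, write the first value in the list."""
    return separators[0]


def split_fields(data, separators):
    """Parse direction: split data on any allowed separator."""
    fields, start, pos = [], 0, 0
    while pos != len(data):
        sep = match_separator(data, pos, separators)
        if sep is None:
            pos += 1
        else:
            fields.append(data[start:pos])
            pos += len(sep)
            start = pos
    fields.append(data[start:])
    return fields


def join_fields(fields, separators):
    """Unparse direction: join fields using the default separator."""
    return default_separator(separators).join(fields)


# The two values written in the example above: x0Dx0A, then x0D.
SEPARATORS = ["\r\n", "\r"]

fields = split_fields("a\r\nb\rc", SEPARATORS)
print(fields)
print(repr(join_fields(fields, SEPARATORS)))
```
</pre>
<font size=2 face="sans-serif">Either separator is accepted on parse, but
unparse always writes the first value, so mixed input is normalised on a round
trip.</font>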
<br>
<br><font size=2 face="sans-serif">However, Steve also pointed out that
in the EDI data format the choice of delimiter comes from an expression,
adding to the complexity, because the allowable value of the delimiter
is then &lt;value from expression&gt; concatenated with &lt;entity&gt;.
Is that supported by the current spec? E.g.:</font>
<br>
<br><font size=2 face="sans-serif"> dfdl:separator ="{..\delimiter}
{..\delimiter}x0Dx0A {..\delimiter}x0D" </font>
<br>
<br><font size=2 face="sans-serif">Simon wondered if we could deal with
this situation in a different way by perhaps handling it as 'delimiter
padding' and having a DFDL option to allow/trim it. But he cautioned that
we must avoid ambiguity - for example, to handle whitespace at the end
of a delimiter which is followed by data which allows whitespace. Steve
said that in that situation you have no choice but to explicitly model the
whitespace and not use the arbitrary entities. </font>
<br><font size=2 face="sans-serif">Geoff thought that if we did go for
the trimming approach we may need to describe separate sets of rules for
whitespace handling for the markup region and for the data region.</font>
<br>
<br><font size=2 face="sans-serif">Steve will take an action to come up
with a proposal.</font>
<br>
<br><font size=2 face="sans-serif"><b>7. Other business</b></font>
<ul>
<li><font size=2 face="sans-serif">Steve would like to discuss a model
of ACORD AL3 length-prefixed data on the working group call, and will add
an item to next week's agenda. Mike and Geoff have been corresponding on
that.</font>
<li><font size=2 face="sans-serif">Within IBM, some changes have been proposed
to Mike's UML model of DFDL. This will be circulated to the working group
when IBM comments are complete.</font></ul>
<br><font size=2 face="sans-serif"><b>Meeting closed, 17:45 GMT</b></font>
<br>
<br>
<br><font size=2 face="sans-serif"><br>
Ian Parkinson<br>
WebSphere ESB Development<br>
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK</font><font size=3 face="sans-serif"><br>
</font>