<br><font size=2 face="sans-serif">Sorry I couldn't make the call. Some
comments:</font>
<br>
<br><font size=2 face="sans-serif">a) we need both WSP and OWSP if DFDL
delimiter properties can only specify a single value. If they can specify
a list of values then you can get away with only needing WSP</font>
<br><font size=2 face="sans-serif"> eg, </font><font size=2 face="Courier New">dfdl:terminator="@
@%WSP;"</font>
<br><font size=2 face="sans-serif">b) if we make WSP mean a single white
space character, we need a second entity for multiple white space characters.</font>
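<br>
<br><font size=2 face="sans-serif">To illustrate point a), here is a rough Python sketch (not DFDL). It assumes a terminator property is a space-separated list, that %WSP; means one or more whitespace characters, and that %OWSP; means zero or more. None of these meanings is settled, and the names <tt>LABELS</tt>, <tt>value_to_regex</tt> and <tt>terminator_regex</tt> are invented for the example. Under those assumptions the two-value list matches the same data as a single value using %OWSP;.</font>

```python
import re

# Assumed meanings, for illustration only: %WSP; = one or more whitespace
# characters, %OWSP; = zero or more whitespace characters.
LABELS = {"%WSP;": r"\s+", "%OWSP;": r"\s*"}

def value_to_regex(value):
    """Turn one delimiter value into a regex, expanding the labels."""
    parts = re.split(r"(%WSP;|%OWSP;)", value)
    return "".join(LABELS.get(p, re.escape(p)) for p in parts)

def terminator_regex(prop):
    """Treat the property as a space-separated list; try longer values first."""
    values = sorted(prop.split(), key=len, reverse=True)
    return re.compile("|".join("(?:%s)" % value_to_regex(v) for v in values))

as_list = terminator_regex("@ @%WSP;")   # list of two values
single = terminator_regex("@%OWSP;")     # single value using %OWSP;

# Both forms consume the same terminator text from each sample.
for data in ["@", "@  \n", "@x"]:
    assert as_list.match(data).group() == single.match(data).group()
```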
<br>
<br><font size=2 face="sans-serif">It doesn't look like you got round to
discussing the other items I sent in (below)? Let's do that next call.</font>
<br>
<br><font size=2 color=blue face="sans-serif">1) One way to handle the
situation where the terminator can vary is to allow the DFDL markup properties
(dfdl:terminator, dfdl:separator, etc) to be lists, just like we already
do for dfdl:nullValues. (IBM's WTX has this capability).</font><font size=3 color=blue>
</font><font size=2 color=blue face="sans-serif"><br>
<br>
2) We've allowed the prefix of a prefixed length to be explicitly described
as a non-event field using dfdl:lengthPrefixType. Should we permit this
for markup properties? Instead of supplying a list of possible values,
you supply a simple type with enums for the values. This could be viewed
as an alternative/complementary to 1). There is a limitation - because
we are using XSDL enumeration facet, we are constrained by its syntax so
I don't see how we could use our own entity scheme or expressions. Also,
I suspect that enums are inherently unordered so we'd need a way of saying
which to use on output (use an element of simple type and use XSDL default
attribute?). Lastly, we should not force a user to model an initiator
as an element/type - most users just see it as a piece of text so just
entering the value must still be allowed. <br>
<br>
3) Let's say my delimiter is dynamically defined at the start of the data,
like EDI allows. We would handle that in DFDL using an expression or variable.
However, EDI also allows random white space to appear after the delimiter.
Can our expression/entity syntaxes handle this? Does this preclude
use of 1) or 2)? </font><font size=3 color=blue> </font><font size=2 face="sans-serif"><br>
</font>
<br><font size=2 face="sans-serif">Regards, Steve<br>
<br>
Steve Hanson<br>
WebSphere Message Brokers<br>
Hursley, UK<br>
Internet: smh@uk.ibm.com<br>
Phone (+44)/(0) 1962-815848</font>
<br>
<br>
<br>
<table width=100%>
<tr valign=top>
<td width=40%><font size=1 face="sans-serif"><b>Ian W Parkinson/UK/IBM@IBMGB</b>
</font>
<br><font size=1 face="sans-serif">Sent by: dfdl-wg-bounces@ogf.org</font>
<p><font size=1 face="sans-serif">24/01/2008 16:19</font>
<td width=59%>
<table width=100%>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td><font size=1 face="sans-serif">dfdl-wg@ogf.org</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">[DFDL-WG] DFDL: Minutes from OGF WG
call, 23 Jan 2007 *CORRECTED*</font></table>
<br>
<br></table>
<br>
<br>
<br><font size=2 face="sans-serif"><br>
A small correction, with thanks to Simon - it was Steve (rather than Simon)
who had previously attracted a reasonable audience at the OGF conference.</font><font size=3>
<br>
</font><font size=2 face="sans-serif"><br>
Ian</font><font size=3> <br>
<br>
<br>
</font><font size=2 face="sans-serif"><b><br>
Open Grid Forum: Data Format Description Language Working Group</b></font><font size=3>
<br>
</font><font size=2 face="sans-serif"><b><br>
Weekly Working Group Conference Call</b></font><font size=3> </font><font size=2 face="sans-serif"><b><br>
17:00 GMT, 23 Jan 2008</b></font><font size=3> <br>
<br>
</font><font size=2 face="sans-serif"><b><br>
Attendees</b></font><font size=3> </font><font size=2 face="sans-serif"><br>
Mike Beckerle (Oco)</font><font size=3> </font><font size=2 face="sans-serif"><br>
Simon Parker (PolarLake)</font><font size=3> </font><font size=2 face="sans-serif"><br>
Ian Parkinson (IBM)</font><font size=3> </font><font size=2 face="sans-serif"><br>
Alan Powell (IBM)</font><font size=3> <br>
</font><font size=2 face="sans-serif"><b><br>
Apologies</b></font><font size=3> </font><font size=2 face="sans-serif"><br>
Steve Hanson (IBM), Suman Kalia (IBM)</font><font size=3> <br>
</font><font size=2 face="sans-serif"><b><br>
1. OGF22</b></font><font size=3> </font><font size=2 face="sans-serif"><br>
The DFDL session at OGF22 is now booked for the Monday afternoon, and Mike
has registered to attend. Mike will present our updated status, and Alan
promised to upload the last set of presented slides to GridForge so that
Mike can update them. Alan asked whether we should attempt to drum up interest
in the DFDL session to encourage attendance; Simon thought that advertising
may not make much difference and that Steve had a reasonable audience when
he presented.</font><font size=3> <br>
</font><font size=2 face="sans-serif"><b><br>
2. Specification drafts</b></font><font size=3> </font><font size=2 face="sans-serif"><br>
Steve and Alan had previously assigned ownership of individual items from
Mike's plan of contents for the next few drafts. Alan will assemble the
next draft, due at the end of the month, and asked for input as soon as
possible.</font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
Looking at the plan for the next, "vX+1", draft, the group reported
the following status:</font><font size=3> </font>
<ul>
<li><font size=2 face="sans-serif"><b>Nulls/default/optionals</b> - Mike
reported no update.</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>Description of schema components</b>
- Simon is still working on this.</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>Regular expressions for lengths</b>
- Alan reported no progress.</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>Expression language</b> - Alan will
shortly distribute a new version of the proposal for review.</font><font size=3>
</font>
<li><font size=2 face="sans-serif"><b>valueCalc</b> - Mike is still to
write this.</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>Property precedence</b> - Following
a discussion on the call last week, please provide review comments. Mike
will add this to the agenda for next week.</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>Entities</b> - Alan's recent proposal
is to be discussed on the current call.</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>White space handling</b> - Discussion
is ongoing, and Steve is to make a proposal.</font></ul><font size=2 face="sans-serif"><br>
The plan calls for subsequent versions of the specification, including
the following items with status:</font><font size=3> </font>
<ul>
<li><font size=2 face="sans-serif"><b>Supplements</b> - Steve is working
to update the supplements</font><font size=3> </font>
<li><font size=2 face="sans-serif"><b>Speculative parsing</b> - IBM has
internally been discussing and reviewing WTX function, though no documentation
presently exists covering this.</font></ul><font size=2 face="sans-serif"><b><br>
3. UML diagrams</b></font><font size=3> </font><font size=2 face="sans-serif"><br>
Simon is revising the UML diagrams which describe the DFDL schema components.
The previous meeting minutes included a number of comments on these diagrams,
and the group took this opportunity to look at some of those comments:</font><font size=3>
<br>
</font><font size=2 face="sans-serif"><i><br>
"...I think it would be better to use the open source XML schema model
as source model and show relationship of DFDL Annotations attached to the
XSD schema model"</i> - Mike noted that DFDL makes use of annotations
on objects which are absent from the XSD schema model, and hence that it
may be unnatural to base the DFDL schema model directly on the XSD model.
Simon suggested that it would be cleanest to describe a modified version
of the XSD model including those XSD elements that we need to annotate, and
use this as a basis for the DFDL model.</font><font size=3> <br>
</font><font size=2 face="sans-serif"><i><br>
"The current diagram suggests that 'variable definition' can both
be part of a format base or as a standalone annotation (outside of a format).
Is this true?" </i>- Mike suggested that variable definitions don't
have to be part of a format block: so, yes, this is true.</font><font size=3>
<br>
</font><font size=2 face="sans-serif"><br>
Mike agreed to respond further to the set of comments by email.</font><font size=3>
<br>
</font><font size=2 face="sans-serif"><b><br>
4. Review of Entities proposal</b></font><font size=3> </font><font size=2 face="sans-serif"><br>
Alan has distributed a proposal covering entities in DFDL, intended to
allow characters which are disallowed by XML 1.0 (or XML 1.1) to be included
in DFDL schemas. These follow a similar syntax to XML, using % instead
of & as an escape, with an additional mechanism for specifying raw
data. This latter is intended to supplant the escaping mechanism described
in current versions of the specification (which also uses % as an escape).</font><font size=3>
<br>
</font><font size=2 face="sans-serif"><br>
The group felt that the description of the raw data entities should not
be cast in terms of characters and character sets, but rather in terms
of bytes. If treated as characters, schemas may need to be rewritten when
moving from single-byte to double-byte character sets; further, this incorrectly
implies some codepage conversion is involved.</font><font size=3> <br>
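<br><font size=2 face="sans-serif">A small Python sketch of why bytes are the safer description: the same character encodes to different byte sequences under different character sets, whereas a value given as a raw byte does not change. The helper name <tt>find_raw</tt> and the sample data are made up for the example.</font>

```python
# The same character maps to different bytes under different encodings,
# so a delimiter described as a character depends on the encoding in use.
char = "\u00a3"  # POUND SIGN
assert char.encode("latin-1") == b"\xa3"
assert char.encode("utf-8") == b"\xc2\xa3"
assert char.encode("utf-16-be") == b"\x00\xa3"

# A raw value described as a byte is the same regardless of encoding.
raw = bytes([0xA3])

def find_raw(data, raw_bytes):
    """Locate a raw-byte delimiter without any character decoding."""
    return data.find(raw_bytes)

assert find_raw(b"AB\xa3CD", raw) == 2
```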
</font><font size=2 face="sans-serif"><br>
The proposal also introduces a list of predefined names for certain common
control characters. Mike asked whether these are the existing XML names
- Alan replied that XML does not define names for control characters. </font><font size=3><br>
</font><font size=2 face="sans-serif"><br>
Ian asked how we should represent the literal % character in strings given
this form of escaping. The present draft of the specification uses "%%"
to handle this; Simon suggested a string like "%pc;". The meeting
felt that %% might be marginally preferable.</font><font size=3> <br>
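<br><font size=2 face="sans-serif">As a minimal sketch of how such escaping might be decoded, the Python below handles "%%" as a literal percent sign alongside named and numeric entities. The numeric forms (<tt>%#xHH;</tt>, <tt>%#DD;</tt>) are an assumption by analogy with XML character references, and the name table is hypothetical; the proposal's actual list is not reproduced here.</font>

```python
import re

# Hypothetical name table, for illustration only.
NAMED = {"NUL": "\x00", "HT": "\t", "LF": "\n", "CR": "\r"}

# Alternatives, in order: %%  |  %#xHEX;  |  %#DEC;  |  %NAME;
TOKEN = re.compile(r"%(?:(%)|#x([0-9A-Fa-f]+);|#([0-9]+);|([A-Za-z]+);)")

def decode(text):
    """Expand %-style entities in a property value into characters."""
    def expand(m):
        if m.group(1):
            return "%"                       # %% is a literal percent sign
        if m.group(2):
            return chr(int(m.group(2), 16))  # hexadecimal code point
        if m.group(3):
            return chr(int(m.group(3)))      # decimal code point
        return NAMED[m.group(4)]             # predefined name (KeyError if unknown)
    return TOKEN.sub(expand, text)

assert decode("100%%") == "100%"
assert decode("a%HT;b") == "a\tb"
assert decode("%#x41;%#66;") == "AB"
assert decode("%%HT;") == "%HT;"   # escaped percent, then plain text
```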
</font><font size=2 face="sans-serif"><br>
Finally, the proposal defines some labels which aim to reduce the complexity
of dealing with whitespace and newlines. The %NL; entity represents a newline
on "the target platform" - Mike observed that DFDL presently
does not have a concept of a target platform. Alan felt it important that
a single DFDL schema be able to generate output documents targeted at
different platforms. Mike proposed that we introduce a new property, "generatedNewLine",
which describes the meaning of %NL; during unparse, and that %NL; should
be tolerant of any common new line representation during parse. The group
discussed whether this could instead be handled using a list of optional
new line values; however, this would not support schema portability. Simon
suggested we introduce another new property to mean that %NL; should be
the conventional new line representation on the platform on which an engine
is running; however, Mike pointed out that this simply requires appropriate
configuration of the generatedNewLine property.</font><font size=3> <br>
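<br><font size=2 face="sans-serif">The asymmetry Mike proposed can be sketched in Python as follows: parsing accepts any common newline representation, while unparsing writes the single representation configured through generatedNewLine. The set of newline forms accepted here (CR LF, LF, CR) is an assumption, and the function names are invented for the example.</font>

```python
import re

# Newline representations accepted on parse (assumed set). CR LF is listed
# first so that it is consumed as one newline rather than two.
ANY_NEWLINE = re.compile(r"\r\n|\n|\r")

def parse_lines(data):
    """Split on %NL;, tolerating any common newline representation."""
    return ANY_NEWLINE.split(data)

def unparse_lines(lines, generated_new_line):
    """Join with the single representation named by generatedNewLine."""
    return generated_new_line.join(lines)

records = parse_lines("a\r\nb\nc\rd")
assert records == ["a", "b", "c", "d"]
assert unparse_lines(records, "\r\n") == "a\r\nb\r\nc\r\nd"
```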
</font><font size=2 face="sans-serif"><br>
%WSP; and %OWSP; are introduced to mean any whitespace, and optional whitespace.
This will be useful in describing some formats which allow arbitrary whitespace,
such as MIME. Mike pointed out that we could model such whitespace using
hidden fields, but that these entities may make a schema clearer. PolarLake
have found that only one such label is necessary, which means, "one
or more whitespace characters", and that this needs only to be made
available as a delimiter - Mike agreed that this label may represent a
special type of delimiter rather than a general purpose entity. Alan would
like to work through the potential use cases to see if we can restrict
it in this fashion, and will update the proposal to specify that these
relate to just one character. Simon suggested we could introduce an extra
label, perhaps %WSP*; to match multiple whitespace characters.</font><font size=3>
<br>
</font><font size=2 face="sans-serif"><b><br>
Meeting closed, 18:15</b></font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
<br>
Ian Parkinson<br>
WebSphere ESB Development<br>
Mail Point 211, Hursley Park, Hursley, Winchester, SO21 2JN, UK</font><font size=3><br>
</font><font size=3 face="sans-serif"><br>
</font><font size=3><br>
</font>
<hr><font size=2 face="sans-serif"><i><br>
</i></font>
<p><font size=2 face="sans-serif"><i>Unless stated otherwise above:<br>
IBM United Kingdom Limited - Registered in England and Wales with number
741598. <br>
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU</i></font><font size=3> </font>
<p><font size=3 face="sans-serif"><br>
</font><font size=3><br>
<br>
</font><font size=3 face="sans-serif"><br>
</font><tt><font size=2>--<br>
dfdl-wg mailing list<br>
dfdl-wg@ogf.org<br>
http://www.ogf.org/mailman/listinfo/dfdl-wg</font></tt>
<br><font size=3 face="sans-serif"><br>
</font>
<br><font size=3 face="sans-serif"><br>
</font>
<p><font size=2 face="sans-serif"><br>
</font><font size=3 face="sans-serif"><br>
</font>
<br>
<br><font size=3 face="sans-serif"><br>
</font>