<br><font size=2 face="sans-serif">Comments in </font><font size=2 color=#008000 face="sans-serif">Green</font>
<br><font size=2 face="sans-serif"><br>
Suman Kalia<br>
IBM Toronto Lab<br>
WebSphere Business Integration Application Connectivity Tools <br>
Tel : 905-413-3923 T/L 969-3923<br>
Fax : 905-413-4850 T/L 969-4850<br>
Internet ID : kalia@ca.ibm.com</font>
<br><font size=1 color=#800080 face="sans-serif">----- Forwarded by Suman
Kalia/Toronto/IBM on 08/16/2007 12:46 PM -----</font>
<br>
<table width=100%>
<tr valign=top>
<td width=40%><font size=1 face="sans-serif"><b>"Simon Parker"
&lt;simon.parker@polarlake.com&gt;</b> </font>
<br><font size=1 face="sans-serif">Sent by: dfdl-wg-bounces@ogf.org</font>
<p><font size=1 face="sans-serif">08/16/2007 12:28 PM</font>
<td width=59%>
<table width=100%>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td><font size=1 face="sans-serif">&lt;dfdl-wg@ogf.org&gt;</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: [DFDL-WG] Minutes from 2007-08-08
Call - comments from Steve</font></table>
<br>
<table>
<tr valign=top>
<td>
<td></table>
<br></table>
<br>
<br>
<br><font size=2 color=blue face="Arial">Responses embedded below</font>
<br><font size=2 color=blue face="Arial"> Simon</font>
<br>
<hr><font size=2 face="Tahoma"><b>From:</b> Steve Hanson [mailto:smh@uk.ibm.com]
<b><br>
Sent:</b> 15 August 2007 12:23<b><br>
To:</b> Mike Beckerle<b><br>
Cc:</b> dfdl-wg@ogf.org; dfdl-wg-bounces@ogf.org; Simon Parker<b><br>
Subject:</b> [DFDL-WG] Minutes from 2007-08-08 Call - comments from Steve</font><font size=3><br>
</font>
<br><font size=2 face="sans-serif"><br>
I've spent today catching up with the recent DFDL spec discussions around
Simon's comments to v0.19. Some comments of my own on the content of these
and previous call minutes.</font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
- General principle: The eventual consumers of DFDL will be users, the majority
of whom will not be data modelling experts; that is certainly the experience
at IBM. Most see data modelling as a black art and find it difficult.
I think that an over-reliance on hidden elements is not going to go down
well. I would err on the side of caution here: only if we are convinced
that a property will be very rarely used should we remove it and replace it
with a hidden element. </font><font size=3> </font><font size=2 color=blue face="Arial"><br>
[Simon] Accepted, providing we can specify everything. Ideally we'll publish
a rigorous, orthogonal language and a convenient, intuitive library with
controlled redundancy.</font><font size=3><br>
</font><font size=2 face="sans-serif"><br>
- Leading/Trailing Skip Bytes is a property intended to handle the byte
skipping added by compilers, over and above simple byte alignment rules.
The formulae for setting the values are beyond the ken of users to apply manually;
this would invariably be done using an automated COBOL -> DFDL translator
or similar. I would not be too troubled if that went 'hidden'.</font><font size=3>
</font>
<br>
<br><font size=3 color=#008000>SKK -- For complex scenarios (e.g. OCCURS
DEPENDING ON elements in COBOL), different compilers follow quite complicated
algorithms to add slack bytes at the front or end of structures so that
array elements are properly aligned. Current COBOL->DFDL technologies
may not be able to extract this information from compilers/interpreters,
as it may not be exposed through well-defined interfaces, in which case
the user may have to adjust the Leading/Trailing skip counts manually.
I would not vote for these attributes to be hidden; they are certainly
advanced properties, used occasionally to cater for such complex scenarios.
</font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
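Illustration only: the "simple byte alignment rules" mentioned above can be sketched in a few lines of Python. The field list and the function names are hypothetical, and the compiler-specific slack-byte algorithms SKK describes are deliberately not modelled; this shows only the baseline arithmetic that a leading skip count would have to capture.<br>
<pre>
```python
def slack_bytes(offset: int, alignment: int) -> int:
    """Padding needed so that offset becomes a multiple of alignment."""
    return (-offset) % alignment


def layout(fields):
    """Place (name, size, alignment) fields in order.

    Returns a list of (name, offset, leading_skip) plus the total length.
    """
    offset = 0
    placed = []
    for name, size, alignment in fields:
        skip = slack_bytes(offset, alignment)
        offset += skip
        placed.append((name, offset, skip))
        offset += size
    return placed, offset


# Hypothetical record: 1-byte flag, 4-byte aligned integer, 2-byte aligned code.
fields = [("flag", 1, 1), ("count", 4, 4), ("code", 2, 2)]
placed, total = layout(fields)
for name, offset, skip in placed:
    print(name, offset, skip)
print("total", total)
```
</pre>
Here the 4-byte field needs three leading skip bytes because it follows a 1-byte field. Anything a real compiler adds beyond this would have to be supplied by a translator or set by hand.<br>
<br>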
- 'finalTerminatorCanBeMissing' property: the rules for interpreting what
trailing markup actually means are complex, and properties like this will
almost certainly be needed. </font>
<br><font size=2 color=#008000 face="sans-serif">SKK -- I tend to agree
with Steve</font>
<br><font size=2 face="sans-serif">(Aside: For Mike's second example, though,
where data of max length n is terminated by markup only if actual length
&lt; n, wouldn't that be better expressed using a regular expression? finalTerminatorCanBeMissing
is too general, and could lead the parser to validly parse data where the
terminator was accidentally omitted).</font><font size=3> <br>
</font><font size=2 face="sans-serif"><br>
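Illustration only: the regular-expression idea in the aside above can be sketched in Python. It assumes a single-character terminator and a field of at most n characters that carries the terminator only when it is shorter than n. The function names are hypothetical and nothing here is DFDL syntax.<br>
<pre>
```python
import re


def bounded_field_pattern(max_len: int, terminator: str) -> re.Pattern:
    """Match either a short field plus terminator, or a full-length field."""
    t = re.escape(terminator)
    ch = f"[^{t}]"  # any character except the terminator
    return re.compile(f"({ch}{{0,{max_len - 1}}}){t}|({ch}{{{max_len}}})")


def parse_field(data: str, pos: int, pattern: re.Pattern):
    """Return (value, next_position) or raise ValueError if nothing matches."""
    m = pattern.match(data, pos)
    if m is None:
        raise ValueError(f"no valid field at position {pos}")
    value = m.group(1) if m.group(1) is not None else m.group(2)
    return value, m.end()


pattern = bounded_field_pattern(5, ";")
print(parse_field("abc;defghij", 0, pattern))  # short field, terminated
print(parse_field("abc;defghij", 4, pattern))  # full-length field, no terminator
```
</pre>
A short field whose terminator was accidentally omitted (the trailing "ij" in the sample) matches neither alternative and is rejected, which is the strictness the aside is asking for.<br>
<br>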
- Infix/prefix/postfix separators. I believe this should be retained. It's
in IBM WTX (Mercator) and I frequently have to apologise for the absence
of postfix in IBM MRM. When a user sees (e.g.) "x,y,z," it's easier for them
to comprehend that the comma after z is a postfix separator rather than
the terminator of the parent group. </font><font size=3><br>
</font><font size=2 face="Arial"><br>
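Illustration only: the three separator positions discussed above can be sketched in Python. The function name and the position strings are hypothetical, and a non-empty separator is assumed.<br>
<pre>
```python
def split_items(text: str, separator: str, position: str):
    """Split text into items, given where the separator sits relative to items."""
    if position == "prefix":
        if not text.startswith(separator):
            raise ValueError("expected a leading separator")
        text = text[len(separator):]
    elif position == "postfix":
        if not text.endswith(separator):
            raise ValueError("expected a trailing separator")
        text = text[:-len(separator)]
    elif position != "infix":
        raise ValueError(f"unknown separator position: {position}")
    return text.split(separator)


print(split_items("x,y,z", ",", "infix"))
print(split_items("x,y,z,", ",", "postfix"))
print(split_items(",x,y,z", ",", "prefix"))
```
</pre>
All three calls yield the same three items. Reading "x,y,z," as infix instead would produce a fourth, empty item, which is the ambiguity a postfix setting removes.<br>
<br>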
- Simon had a comment on the removal of 'applies' which I haven't seen
discussed (</font><font size=2 color=blue face="Arial">"I find this
cumbersome. I suggest this alternative: drop ‘applies’ and ‘dfdl:format’,
insist on ‘dfdl:sequence’ and friends instead, and add local variants
like ‘dfdl:sequenceLocal’. For attribute shorthand, add boolean attributes
with the same name: sequenceLocal=”true” (optional, default false)."</font><font size=2 face="Arial">).
</font>
<br>
<br><font size=2 color=#008000 face="Arial">SKK - I am not comfortable
with using names like sequenceLocal; if we go with this, we will quickly
compound the problem with <i>allLocal, choiceLocal</i>, etc. The intent
here is to specify the scope, and that is best expressed through one generic
property whose enumeration values identify the scope. </font>
<br>
<br><font size=2 face="Arial">I don't follow: the use of 'applies' is orthogonal
to whether you use dfdl:format or one of the specific elements such as
dfdl:sequence. </font><font size=2 color=blue face="Arial"><br>
[Simon] You're right, the ideas should be discussed separately. My hasty
comment throws it all in together.</font>
<br><font size=3> </font>
<br><font size=2 color=blue face="Arial">1 Replace this:</font>
<br><font size=2 color=blue face="Arial"> &lt;dfdl:format
applies="hereOnly"&gt;</font>
<br><font size=2 color=blue face="Arial">with this:</font>
<br><font size=2 color=blue face="Arial"> &lt;dfdl:formatLocal&gt;</font>
<br><font size=3> </font>
<br><font size=2 color=blue face="Arial">Why? Because 'applies' is a metaproperty
that doesn't describe the representation, and should be prominent. Also,
for brevity.</font>
<br><font size=3> </font>
<br><font size=2 color=blue face="Arial">2 Replace this:</font>
<br><font size=2 color=blue face="Arial"> &lt;dfdl:format&gt;</font>
<br><font size=2 color=blue face="Arial">with one of these:</font>
<br><font size=2 color=blue face="Arial"> &lt;dfdl:element&gt;
&lt;dfdl:sequence&gt; &lt;dfdl:complexType&gt;...</font>
<br><font size=3> </font>
<br><font size=2 color=blue face="Arial">Why? For ease of validation and
interpretation, to make mistakes more obvious to human readers, and to
support more rigorous specification of the relationship between properties
and xsd constructs.</font>
<br><font size=3> </font>
<br><font size=2 face="sans-serif"><br>
Regards, Steve<br>
<br>
Steve Hanson<br>
WebSphere Message Brokers<br>
Hursley, UK<br>
Internet: smh@uk.ibm.com<br>
Phone (+44)/(0) 1962-815848</font><font size=3> <br>
<br>
</font>
<table width=100%>
<tr valign=top>
<td width=38%><font size=1 face="sans-serif"><b>Mike Beckerle &lt;beckerle@us.ibm.com&gt;</b>
<br>
Sent by: dfdl-wg-bounces@ogf.org</font><font size=3> </font>
<p><font size=1 face="sans-serif">14/08/2007 14:23</font><font size=3>
</font>
<td width=61%>
<br>
<table width=100%>
<tr valign=top>
<td width=10%>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td width=89%><font size=1 face="sans-serif">dfdl-wg@ogf.org, "Simon
Parker" &lt;simon.parker@polarlake.com&gt;</font><font size=3> </font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: [DFDL-WG] Minutes from 2007-08-08
Call</font></table>
<br>
<br>
<table width=100%>
<tr valign=top>
<td width=50%>
<td width=50%></table>
<br></table>
<br><font size=3><br>
<br>
</font><font size=2 face="sans-serif"><br>
<br>
I forgot to clarify Simon's question on sp165.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
This was the 'finalTerminatorCanBeMissing' property. <br>
<br>
We considered the comment that this might be unnecessary. <br>
<br>
Use case: a file in text format. Each "record" in the file is terminated
by a CRLF, so says the user. At the top level this file contains an array
of these records. <br>
<br>
The file might or might not have a CRLF at the end of the file because
human beings might have edited the file with a text editor, and either
inserted or neglected to insert this final CRLF.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
We want the file format to be legal with or without the final CRLF; however,
all prior CRLFs in the file must be present.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
So how do we express this? There are two choices:</font><font size=3> </font><font size=2 face="sans-serif"><br>
1) CRLF is a terminator of the record.</font><font size=3> </font><font size=2 face="sans-serif"><br>
2) CRLF is an occursSeparator of the enclosing array; records have no terminator.
We enclose the array in a sequence group where the array is followed by
a hidden "optional" (minOccurs=0, maxOccurs=1) element with a fixed="CRLF"
string value.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
Choice (1) requires that we have finalTerminatorCanBeMissing</font><font size=3>
</font><font size=2 face="sans-serif"><br>
<br>
Choice (2) is just modeling the behavior that is required directly via
hidden elements. This is tantamount to saying that this keyword is not
worth having because there is a way to model it already. This is true of
many keywords. If we deem this one too obscure, then we need to revisit
many others. (Leading/Trailing Skip Bytes is a good example. Trivially
represented by a hidden element). What are our criteria for inclusion?
Up until now our criteria have been to include things that existing systems
already have found a need for. However, existing systems don't have hidden
field capability.</font><font size=3> </font><font size=2 face="sans-serif"><br>
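<br>
Illustration only: the two modelling choices can be compared with a minimal Python sketch. The function names and the final_terminator_can_be_missing parameter are hypothetical stand-ins for the property under discussion; this is not a DFDL implementation.<br>
<pre>
```python
CRLF = "\r\n"


def parse_terminated(data, terminator=CRLF, final_terminator_can_be_missing=False):
    """Choice (1): every record ends with the terminator; the flag relaxes the last one."""
    records = []
    pos = 0
    while pos != len(data):
        end = data.find(terminator, pos)
        if end == -1:
            if not final_terminator_can_be_missing:
                raise ValueError("last record is missing its terminator")
            records.append(data[pos:])
            break
        records.append(data[pos:end])
        pos = end + len(terminator)
    return records


def parse_separated(data, separator=CRLF):
    """Choice (2): records are separated; an optional trailing element absorbs a final CRLF."""
    if data.endswith(separator):
        data = data[:-len(separator)]  # the hidden optional element
    return data.split(separator) if data else []


with_final = "a\r\nb\r\n"
without_final = "a\r\nb"
print(parse_terminated(with_final))
print(parse_terminated(without_final, final_terminator_can_be_missing=True))
print(parse_separated(with_final), parse_separated(without_final))
```
</pre>
Both choices accept the file with or without the final CRLF. The difference is whether that tolerance is declared by a property or built into the structure of the model.<br>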
<br>
Note that this same missing final terminator issue can come up not only
with End-of-data, but with any bounded size structure.</font><font size=3>
</font><font size=2 face="sans-serif"><br>
<br>
E.g., suppose we say that an array has occursUnits="bytes" and
occursPath="874". Then it is 874 bytes long. The array elements
can be terminated by a particular delimiter, e.g., a semicolon. For the same reasons
as the CRLF example above, we want to be able to tolerate a missing final
semicolon before the end of the 874 bytes. In effect the byte-length-limit
creates an implicit "end-of-data" for a sub-stream consisting
of just those bytes. <br>
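<br>
Illustration only: the bounded-length case can be sketched in Python by treating the byte limit as an implicit end-of-data for a sub-stream. The function name is hypothetical. The 874-byte length and the semicolon terminator come from the example above.<br>
<pre>
```python
def parse_bounded_array(stream: bytes, start: int, length: int, terminator: bytes = b";"):
    """Parse terminated items from exactly `length` bytes beginning at `start`.

    The final terminator inside the window may be missing.
    Returns (items, position just past the window).
    """
    window = stream[start:start + length]
    if len(window) != length:
        raise ValueError("stream ends before the bounded region does")
    items = window.split(terminator)
    # A trailing empty piece means the final terminator was present; drop it.
    if items and items[-1] == b"":
        items.pop()
    return items, start + length


# 874 bytes: 436 terminated items followed by one 2-byte item with no terminator.
stream = b"x;" * 436 + b"xy" + b"data after the array"
items, next_pos = parse_bounded_array(stream, 0, 874)
print(len(items), items[-1], next_pos)
```
</pre>
The window boundary plays the role that end-of-file plays in the CRLF example: the last item is accepted whether or not its semicolon fits inside the 874 bytes.<br>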
<br>
Conclusion: finalTerminatorCanBeMissing seems to be useful enough and comes
up often enough that I think the keyword is worthwhile.</font><font size=3>
</font><font size=2 face="sans-serif"><br>
<br>
Implication: we should create a list of keywords or enumerated values for
properties that we think are in the grey area where perhaps we want
to drop them. Here are some candidates: byteOrderMarkPolicy, leading/trailingSkipBytes.
Both these can be modeled readily as hidden elements. There are probably
others.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
Mike Beckerle<br>
STSM, Architect, Scalable Computing<br>
IBM Software Group<br>
Information Platform and Solutions<br>
Westborough, MA 01581<br>
direct: voice and FAX 508-599-7148<br>
assistant: Pam Riordan <br>
priordan@us.ibm.com
<br>
508-599-7046</font><font size=3><br>
<br>
<br>
</font>
<table width=100%>
<tr valign=top>
<td width=37%><font size=1 face="sans-serif"><b>Mike Beckerle/Worcester/IBM</b></font><font size=3>
</font>
<p><font size=1 face="sans-serif">08/14/2007 08:40 AM</font><font size=3>
</font>
<td width=62%>
<br>
<table width=100%>
<tr valign=top>
<td width=13%>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td width=86%><font size=1 face="sans-serif">"Simon Parker" &lt;simon.parker@polarlake.com&gt;</font><font size=3>
</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td><font size=1 face="sans-serif">dfdl-wg@ogf.org</font><font size=3>
</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: [DFDL-WG] Minutes from 2007-08-08
Call</font></table>
<br><font size=3><br>
</font>
<br>
<table width=100%>
<tr valign=top>
<td width=50%>
<td width=50%></table>
<br></table>
<br><font size=3><br>
</font><font size=2 color=blue face="Arial"><br>
<br>
In conjunction with the annotated document these notes are clear, except
for 'sp165'. Perhaps someone will recapitulate the discussion briefly at
Wednesday's conference. I think only three annotations remain:</font><font size=3>
</font><font size=2 color=blue face="Arial"><br>
<br>
sp167 Absent and missing (expanded discussion on the wiki already)</font><font size=3>
</font><font size=2 color=blue face="Arial"><br>
<br>
This will be a major topic on a call.</font><font size=3> </font><font size=2 color=blue face="Arial"><br>
<br>
sp172 separatorType="infix"</font><font size=3> </font><font size=2 color=blue face="Arial"><br>
<br>
I'm happy to drop this strange stuff about separatorType=prefix or postfix
and just say separator means infix. However, I would note that at least
two major integration products (IBM WebSphere Transformation Extender,
formerly Mercator, and Microsoft BizTalk) have this concept, so we may
end up putting it back in. Presumably MS copied the earlier Mercator style,
or both got it from common requirements in some EDI standard.</font><font size=3>
</font><font size=2 color=blue face="Arial"><br>
<br>
sp173 defaultWhenMissing (expanded discussion on the wiki already)</font><font size=3>
<br>
<br>
Same topic as sp167 above. Will have a call topic to discuss. <br>
</font><font size=2 color=blue face="Arial"><br>
I've added another contribution to the wiki discussion on 'require'.</font><font size=3>
</font><font size=2 color=blue face="Arial"><br>
<br>
I think this has reached a resolution, which is that we can express this
using assertions. The general style of using DFDL to describe what fixed-data
syntactic constructs look like is a good one.</font><font size=3> </font><font size=2 color=blue face="Arial"><br>
<br>
However, I've amended the Wiki thread on this with a further issue for
group consideration. See bottom of page: <br>
https://forge.gridforum.org/sf/wiki/do/viewPage/projects.dfdl-wg/wiki/Require?_message=1187096164776</font><font size=3>
<br>
</font><font size=2 color=blue face="Arial"><br>
The 'length and occurs' proposal is an improvement, though I still have
reservations to discuss; likewise the 'opaque data' proposal.</font><font size=3>
</font><font size=2 face="sans-serif"><br>
<br>
For a call, this week or soon. I will send out an agenda.</font><font size=3>
</font><font size=2 face="sans-serif"><br>
<br>
Mike Beckerle<br>
STSM, Architect, Scalable Computing<br>
IBM Software Group<br>
Information Platform and Solutions<br>
Westborough, MA 01581<br>
direct: voice and FAX 508-599-7148<br>
assistant: Pam Riordan <br>
priordan@us.ibm.com
<br>
508-599-7046</font><font size=3><br>
<br>
<br>
</font>
<table width=100%>
<tr valign=top>
<td width=51%><font size=1 face="sans-serif"><b>"Simon Parker"
&lt;simon.parker@polarlake.com&gt;</b> <br>
Sent by: dfdl-wg-bounces@ogf.org</font><font size=3> </font>
<p><font size=1 face="sans-serif">08/13/2007 10:56 AM</font><font size=3>
</font>
<td width=48%>
<br>
<table width=100%>
<tr valign=top>
<td width=15%>
<div align=right><font size=1 face="sans-serif">To</font></div>
<td width=84%><font size=1 face="sans-serif">&lt;dfdl-wg@ogf.org&gt;</font><font size=3>
</font>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">cc</font></div>
<td>
<tr valign=top>
<td>
<div align=right><font size=1 face="sans-serif">Subject</font></div>
<td><font size=1 face="sans-serif">Re: [DFDL-WG] Minutes from 2007-08-08
Call</font></table>
<br><font size=3><br>
</font>
<br>
<table width=100%>
<tr valign=top>
<td width=50%>
<td width=50%></table>
<br></table>
<br><font size=3><br>
<br>
<br>
<br>
<br>
</font><font size=2 color=blue face="Arial"><br>
In conjunction with the annotated document these notes are clear, except
for 'sp165'. Perhaps someone will recapitulate the discussion briefly at
Wednesday's conference. I think only three annotations remain:</font><font size=3>
</font><font size=2 color=blue face="Arial"><br>
<br>
sp167 Absent and missing (expanded discussion on the wiki already)</font><font size=3>
</font><font size=2 color=blue face="Arial"><br>
sp172 separatorType="infix"</font><font size=3> </font><font size=2 color=blue face="Arial"><br>
sp173 defaultWhenMissing (expanded discussion on the wiki already)</font><font size=3>
<br>
</font><font size=2 color=blue face="Arial"><br>
I've added another contribution to the wiki discussion on 'require'.</font><font size=3>
<br>
</font><font size=2 color=blue face="Arial"><br>
The 'length and occurs' proposal is an improvement, though I still have
reservations to discuss; likewise the 'opaque data' proposal.</font><font size=3>
<br>
</font><font size=2 color=blue face="Arial"><br>
Regards,</font><font size=3> </font><font size=2 face="Arial"><br>
Simon</font><font size=3> <br>
<br>
<br>
</font>
<hr><font size=2 face="Tahoma"><b>From:</b> dfdl-wg-bounces@ogf.org [mailto:dfdl-wg-bounces@ogf.org]
<b>On Behalf Of </b>Mike Beckerle<b><br>
Sent:</b> 08 August 2007 18:00<b><br>
To:</b> dfdl-wg@ogf.org<b><br>
Subject:</b> [DFDL-WG] Minutes from 2007-08-08 Call</font><font size=2 face="sans-serif"><br>
<br>
<br>
MikeB, Geoff Judd, Alan Powell attended.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
Continued through SP's comments.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
sp37 - got it.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
sp45 - agree. This whole part to be rewritten.</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
sp115 - ok. "strict" and "lax" as enums. No built-in default -
we never use defaults in the processor itself. Only in the predefined formats.</font><font size=3>
</font><font size=2 face="sans-serif"><br>
<br>
sp118 - ok</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
sp123 - Proposal to simplify length, lengthKind, lengthUnits, and also
occursKind, occursPath, occursPathUnits needed. (along the lines of byteCount,
itemCount, length='delimited' enum, etc.)</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
sp154 - Need a specific proposal to eliminate hexBinary and to decide what to use
for opaque data instead (consider also string with encoding='bytes'), or introduce
a dfdl:byteString type or dfdl:opaque type (a derived type - just a standard
name).</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
<br>
sp158 - see sp123</font><font size=3> </font><font size=2 face="sans-serif"><br>
<br>
sp165 - we need a composition property for enclosing groups and/or
end-of-data. Regexp doesn't fix this. <br>
<br>
<br>
Mike Beckerle<br>
STSM, Architect, Scalable Computing<br>
IBM Software Group<br>
Information Platform and Solutions<br>
Westborough, MA 01581<br>
direct: voice and FAX 508-599-7148<br>
assistant: Pam Riordan <br>
priordan@us.ibm.com <br>
508-599-7046</font><tt><font size=2><br>
--<br>
dfdl-wg mailing list<br>
dfdl-wg@ogf.org<br>
http://www.ogf.org/mailman/listinfo/dfdl-wg</font></tt><font size=3> </font><font size=3 face="sans-serif"><br>
</font><font size=3><br>
</font><font size=3 face="sans-serif"><br>
</font><font size=3><br>
</font>
<hr><font size=2 face="sans-serif"><i><br>
</i></font>
<p><font size=2 face="sans-serif"><i>Unless stated otherwise above:<br>
IBM United Kingdom Limited - Registered in England and Wales with number
741598. <br>
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
3AU</i></font><font size=3> </font>
<p><font size=3 face="sans-serif"><br>
</font><font size=3><br>
<br>
</font><font size=3 face="sans-serif"><br>
</font>