[dfdl-wg] DFDL subset of XML schema
mike.beckerle at ascentialsoftware.com
Wed Apr 13 09:30:51 CDT 2005
Yes, we're not really proposing that subset for the "final" language, rather
we're trying to be as minimalist as we can up front to facilitate the
I think it's fine if the prototype is a subset of what we specify for v1.0.
I just don't want it to be inconsistent with it. The prototype is
"non-normative" in w3c lingo, so there will be two documents. One describing
what the prototype does and implements, and the draft standard document can
Per your point 2 below: let's split this into "single top level global
element" and the others. The others cause no issue as far as I can tell. In
fact in our prototype it turns out that we don't even see any difference
between an element with anonymous type and an element reference. The code
just sees an element with name, type, etc. So those are easy to support.
Attributes are easily supported also. We need to add a flag bit to our
prototype, that's all.
The single top level issue is a tiny bit deeper. In XML you can get away
with more than one global top level element because the documents are always
tagged to make it unambiguous which one of the possible global top level
elements describes the file. In DFDL we need specific information about
which of the possible global element declarations applies to the actual file
since there may be nothing in the data which makes it clear. We could do
this with an annotation that indicates "this is the one that actually
applies to the file". What we did in the prototype is just require there be
only one global element declaration to make this unambiguous.
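As a rough illustration of that rule (a sketch only, not the prototype's actual code), a check for "exactly one global element declaration" could look like the Python below. The function names and the two sample schemas are made up for this example.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def global_element_names(schema_text):
    """Return names of element declarations that are direct children of xs:schema."""
    root = ET.fromstring(schema_text)
    # A path without "//" matches direct children only, so local elements
    # nested inside complex types are not counted.
    return [el.get("name") for el in root.findall(f"{{{XSD_NS}}}element")]


def check_single_root(schema_text):
    """Return the single global element name, or raise ValueError if ambiguous."""
    names = global_element_names(schema_text)
    if len(names) != 1:
        raise ValueError(f"expected exactly one global element, found {names}")
    return names[0]


# Hypothetical schema with two candidate roots: nothing in untagged data
# says which declaration describes the file.
AMBIGUOUS = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="header" type="xs:string"/>
  <xs:element name="record" type="xs:int"/>
</xs:schema>"""

# Hypothetical schema with one global element; the local element "id"
# inside the named type does not count as a candidate root.
UNAMBIGUOUS = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="recordType">
    <xs:sequence>
      <xs:element name="id" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="record" type="recordType"/>
</xs:schema>"""

if __name__ == "__main__":
    print(check_single_root(UNAMBIGUOUS))
    try:
        check_single_root(AMBIGUOUS)
    except ValueError as err:
        print("rejected:", err)
```

The annotation alternative would replace the length check with a lookup of whichever declaration carries the "this one applies" marker.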
The primary reason we left out element references is to be minimal. There's
nothing you can do with them that you can't do with a type definition and an
ordinary element declaration, so they seem simply unnecessary, and that
solved our ambiguity problem too.
Reusable groups - yes these are easily supported also, modulo that they can
have separate minOccurs/maxOccurs at point of use. Again I think our
prototype never even sees them. The EMF XSD library essentially forward
substitutes them for us so our code never deals with them. Right now we'd
miss any additional min/max occurs information though, so that is a bug.
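To make the min/max occurs point concrete, here is a minimal Python sketch of forward substitution that carries the occurrence attributes from the group reference onto the inlined model group. It is not the EMF XSD library's behaviour or the prototype's code; `inline_groups` and the sample schema are invented for illustration, and nested group references and namespace prefixes are handled only naively.

```python
import copy
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
MODEL_GROUPS = (XS + "sequence", XS + "choice", XS + "all")


def inline_groups(schema_text):
    """Replace each <xs:group ref=...> with a copy of the referenced model group."""
    root = ET.fromstring(schema_text)
    # Named group definitions are direct children of xs:schema.
    groups = {g.get("name"): g for g in root.findall(XS + "group")}
    for parent in list(root.iter()):  # snapshot, since the tree is edited below
        for index, child in enumerate(list(parent)):
            ref = child.get("ref")
            if child.tag != XS + "group" or ref is None:
                continue
            # Simplification: drop any prefix instead of resolving the QName.
            definition = groups[ref.split(":")[-1]]
            model = next(c for c in definition if c.tag in MODEL_GROUPS)
            inlined = copy.deepcopy(model)
            # Keep the occurrence constraints given at the point of use.
            for attr in ("minOccurs", "maxOccurs"):
                if child.get(attr) is not None:
                    inlined.set(attr, child.get(attr))
            inlined.tail = child.tail
            parent[index] = inlined
    return root


# Hypothetical schema: the group "point" is used with its own occurs values.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:group name="point">
    <xs:sequence>
      <xs:element name="x" type="xs:int"/>
      <xs:element name="y" type="xs:int"/>
    </xs:sequence>
  </xs:group>
  <xs:element name="path">
    <xs:complexType>
      <xs:sequence>
        <xs:group ref="point" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

if __name__ == "__main__":
    print(ET.tostring(inline_groups(SCHEMA), encoding="unicode"))
```

Dropping the loop over `minOccurs` and `maxOccurs` reproduces the bug described above: the group's content still appears in place, but as if it occurred exactly once.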
Re: hexbinary - I don't understand your use of hexbinary. Can you clarify?
Other simple types: yes we could put all of the date types in. We just chose
to keep it minimal. I left out the obscure 'date fragment' types because
I've never seen data containing things like that, but it's a very minor
thing. If you think we need them, then we need them. However I would argue
against putting in things for the sake of having more of XSD "covered".
Substitution groups - I agree these could be a pseudo choice construct, but
I prefer to make an XSD subset and be explicit about it being a subset
rather than go for a way to assign meaning to everything in XSD.
From: owner-dfdl-wg at ggf.org [mailto:owner-dfdl-wg at ggf.org] On Behalf Of
Sent: Wednesday, April 13, 2005 6:59 AM
To: dfdl-wg at gridforum.org
Subject: [dfdl-wg] DFDL subset of XML schema
Mike, looking at your proposal working draft, I don't agree with the DFDL
subset you are proposing. I think it is too restrictive. Specifically:
1) xsd:all - we have discussed this as part of the unordered mail exchange
last week so I think we now agree this is needed.
2) Single top-level global element, global attributes, element references,
attribute references. This prevents re-use.
3) Reusable groups. Ditto.
4) Simple type hexBinary. This is the MRM model's default mapping for binary
5) Other simple types. Some of these could be discussed - eg, the date
6) Substitution groups. We basically treat these as a choice in non-XML data.
But I would be ok with deferring support post 1.0.
WebSphere Business Integration Brokers,
IBM Hursley, England
Internet: smh at uk.ibm.com
Phone (+44)/(0) 1962-815848