[DFDL-WG] 7-bit ascii packed together
mbeckerle.dfdl at gmail.com
Fri Oct 19 15:44:19 EDT 2012
I have a data format in front of me that has 64 7-bit ASCII characters, but
the format has them bit-packed, i.e., 448 = 7 * 64 bits, so the
character codes aren't octet/byte aligned.
Furthermore, the 'string' either uses up the entire 64-character maximum
length or it is terminated by a character whose code is 0x7F.
I believe I was the advocate for a position that character codes should
always be 8-bit aligned. That would be because I had never seen anything
I am told there are also 6-bit ASCII variations, similarly packed together
to save space.
BTW: This occurs in a specific US MIL STD message header format, so it's
not like it's some obscure unused corner case.
Right now, the best I think I can do is to model this data not as a string
at all, but as an array of integers, each one having 7-bit length, and not
aligned (that is, aligned to 1-bit). Doing that I can use
occursCountKind='parsed', and an assertion to deal with the optional
termination by 0x7F value.
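
To make the array-of-integers reading concrete, here is a minimal Python sketch of what such a parse has to do. It is not DFDL and not taken from the MIL STD; the function name is made up, and it assumes the bits are packed most-significant-bit first, which the actual format may or may not do.

```python
from typing import List


def unpack_7bit_fields(data: bytes, max_chars: int = 64,
                       terminator: int = 0x7F) -> List[int]:
    """Read consecutive unaligned 7-bit integers from data.

    Assumption: bits are consumed most-significant-bit first.
    Stops after max_chars values, or when the terminator code is read
    (the terminator itself is not returned).
    """
    codes: List[int] = []
    total_bits = len(data) * 8
    bit_pos = 0
    while len(codes) < max_chars and bit_pos + 7 <= total_bits:
        value = 0
        for i in range(7):
            byte_index, bit_index = divmod(bit_pos + i, 8)
            bit = (data[byte_index] >> (7 - bit_index)) & 1
            value = (value << 1) | bit
        bit_pos += 7
        if value == terminator:
            break
        codes.append(value)
    return codes


# 'A' (1000001), 'B' (1000010), then 0x7F (1111111), zero-padded to 3 bytes.
sample = bytes([0x83, 0x0B, 0xF8])
decoded = "".join(chr(c) for c in unpack_7bit_fields(sample))
```

The loop plays the role of occursCountKind='parsed' (keep reading elements until something stops you), and the terminator check plays the role of the assertion.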
To handle this as a string, we'd need to be able to specify that the
character codes are not aligned, and the width of the bit-fields making up
each character code. Or I suppose we could just say this is a special kind
of character set encoding "ASCII-7-bit-packed" or something.
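
As a rough illustration of what an "ASCII-7-bit-packed" encoding would mean on the unparse side, here is a small Python sketch. The function name is hypothetical, and the most-significant-bit-first order and zero padding of the last byte are assumptions, not something the format in question is known to specify.

```python
def pack_7bit_ascii(text: str) -> bytes:
    """Pack 7-bit character codes back to back, with no byte alignment.

    Assumptions: most-significant-bit first, and any unused bits in the
    final byte are filled with zeros.
    """
    acc = 0      # bits not yet written out
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("not a 7-bit character: %r" % ch)
        acc = (acc << 7) | code
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1
    if nbits:
        # Left-justify the leftover bits in a final, zero-padded byte.
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


packed = pack_7bit_ascii("AB")
full = pack_7bit_ascii("A" * 64)
```

A full 64-character string comes out as exactly 56 bytes (448 bits), so only shorter, terminated strings leave padding bits to worry about.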
Having that, I could deal with the termination via a choice of either the
terminated flavor, or the fixed length flavor (which excludes the
terminator) by way of a choice of two strings each having a
Mike Beckerle | OGF DFDL WG Co-Chair