[Xml-bin] An attempt to sort things out [long]

Stefan Zier Stefan.Zier@syntion.com
Wed, 18 Apr 2001 11:17:47 +0200


> First, I had the opportunity to speak with Charles Goldfarb 2 years ago
> regarding binary encoding of XML when I co-authored a chapter of the XML
> Handbook, 2nd ed., with him.  He mentioned that there was a similar effort
> to "binarize" SGML, which met with very limited success.  His gut feel,
> however, was that it was the complexity of SGML that limited their success
> and that XML _might_ have simplified things enough for this stuff to
> actually work.

Sounds very interesting. Do you think you can get him involved here or at
least get him to provide some pointers to that earlier work?

> 1) a binary structural encoding of straight XML (so we don't lose the
> "good stuff" of a textual representation of the actual data, including the
> byte-swapping problem, the small-value problem

So this would sit at the lowest level of the model that I mentioned:
basically a lossless mechanism that preserves everything that existed in the
original document (even whitespace).

> 2) using the DOM/SAX/JDOM API set to interact with generic binary data.

Quite an interesting idea, actually. To recap your thought: this would mean
that textual XML and binary XML would be just two among many formats that
could be read via those APIs. So what would need to be designed here is not
a standard binary format but rather a framework that enables developers to
easily plug DOM/SAX/JDOM APIs on top of any binary data. The binary
structural encoding mentioned above would be one use case for that framework.
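
To make the recap concrete, here is a minimal sketch in Python of driving a
SAX ContentHandler from a binary stream. The record layout (one tag byte plus
a two-byte length) and the names encode, parse_binary and Recorder are purely
my own assumptions for illustration, not a proposed format:

```python
import io
import struct
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

# Hypothetical record layout, assumed only for this sketch:
#   1 byte  kind:   1 = start element, 2 = end element, 3 = text
#   2 bytes length: big-endian, followed by that many UTF-8 bytes
START, END, TEXT = 1, 2, 3


def encode(records):
    """Serialize (kind, string) pairs into the assumed binary layout."""
    out = bytearray()
    for kind, value in records:
        data = value.encode("utf-8")
        out += struct.pack(">BH", kind, len(data)) + data
    return bytes(out)


def parse_binary(blob, handler):
    """Read the binary records and fire the matching SAX callbacks."""
    stream = io.BytesIO(blob)
    handler.startDocument()
    while True:
        header = stream.read(3)
        if not header:
            break  # clean end of input
        if len(header) < 3:
            raise ValueError("truncated record header")
        kind, length = struct.unpack(">BH", header)
        value = stream.read(length).decode("utf-8")
        if kind == START:
            handler.startElement(value, AttributesImpl({}))
        elif kind == END:
            handler.endElement(value)
        elif kind == TEXT:
            handler.characters(value)
        else:
            raise ValueError("unknown record kind: %d" % kind)
    handler.endDocument()


class Recorder(ContentHandler):
    """Ordinary SAX handler; it cannot tell the source was binary."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))
```

The handler side is plain SAX, so the same Recorder could equally be passed
to a textual XML parser; only the event producer changes.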

On the other hand, the API specs themselves already pretty much provide that
framework. Other than some tools to make things easier for developers,
there's not much that can be added, IMHO. What I do like about this thought is
that we could actually create multiple different binary encodings of XML,
each specialized for certain criteria (e.g. fast SAX parsing or random
access).

> <opinion>
> In summary, my belief is that the fall-out of XML will be the tools and APIs
> that have been developed around this "new" way of handling data
> processing/interchange.  Whether the message is in a binary encoding or a
> textual encoding will be an invisible "flip-the-switch" option related to
> the transport (excuse me -- "session") layer.
> </opinion>

<idea id="554351">
Maybe we could come up with some kind of envelope that wraps any binary data
and identifies the specific parser to use. So, for example, the framework
could pick up an IIOP message wrapped in this envelope and find the
appropriate parser to create the SAX events the application is so desperate
to get. Does this make any sense to anybody? Or do I just need more
caffeine? *g*
</idea>
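
A rough sketch of how such an envelope might look, in Python. Everything
here is an assumption made up for illustration: the magic bytes, the
one-byte length prefix for the format id, and the names wrap, unwrap,
register and dispatch. Only xml.sax.parseString is a real library call; an
IIOP parser would be registered the same way as the two toy parsers below.

```python
import struct
import xml.sax
from xml.sax.handler import ContentHandler

MAGIC = b"XENV"  # hypothetical marker identifying an envelope
PARSERS = {}     # format id -> function(payload, handler)


def wrap(format_id, payload):
    """Prefix the payload with the magic bytes and a length-prefixed id."""
    name = format_id.encode("ascii")
    return MAGIC + struct.pack(">B", len(name)) + name + payload


def unwrap(envelope):
    """Return (format_id, payload), or raise if this is not an envelope."""
    if envelope[:4] != MAGIC:
        raise ValueError("not an envelope")
    n = envelope[4]
    return envelope[5:5 + n].decode("ascii"), envelope[5 + n:]


def register(format_id):
    """Decorator that adds a parser function to the registry."""
    def decorator(fn):
        PARSERS[format_id] = fn
        return fn
    return decorator


def dispatch(envelope, handler):
    """Pick the parser named in the envelope and let it feed the handler."""
    format_id, payload = unwrap(envelope)
    if format_id not in PARSERS:
        raise LookupError("no parser registered for %r" % format_id)
    PARSERS[format_id](payload, handler)


@register("text/xml")
def parse_textual_xml(payload, handler):
    # Textual XML is just one format among many: use the stock SAX parser.
    xml.sax.parseString(payload, handler)


@register("demo/lines")
def parse_lines(payload, handler):
    # Toy binary-ish format: every line becomes a <line> element.
    handler.startDocument()
    for line in payload.decode("utf-8").splitlines():
        handler.startElement("line", {})
        handler.characters(line)
        handler.endElement("line")
    handler.endDocument()


class ElementNames(ContentHandler):
    """Collects element names and text, whatever the source format was."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.text = []

    def startElement(self, name, attrs):
        self.names.append(name)

    def characters(self, content):
        self.text.append(content)
```

The application only ever sees the handler callbacks; which parser ran is
decided entirely by the id carried in the envelope.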

---------------------------------------
Stefan Zier
Software Developer
Syntion AG - http://www.syntion.com
Leonrodplatz 2 - 80636 Munich/Germany
Phone +49 89 52 30 45-0
Fax +49 89 52 30 45-20