[REBOL] ANN: xmlparse.r "a more compliant XML parser"
From: gavin::mckenzie::sympatico::ca at: 13-Jul-2001 19:11
Folks,
Here's my first crack at trying to get my own XML parser up to a quality
where I feel prepared to release it.
The benefits of this parser over REBOL's built-in parse-xml?
- automatically expands character entities like &amp; &lt; etc.
- parses out the information contained in the XML document prolog
- handles CDATA sections
- handles comments
- handles processing instructions
and...
- provides a parser-callback handler interface modeled on SAX
- includes a handler that converts the parsed XML into a series of nested
blocks like REBOL's built-in parse-xml
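To illustrate the handler idea in the list above, here is a minimal sketch of a SAX-style callback handler that collects parse events into nested blocks. It is written in Python with the standard xml.sax module, not in REBOL, and it is not the script's actual interface. The names BlockHandler and to_blocks, and the [name, attributes, children] layout, are assumptions made for the example and only loosely resemble what parse-xml returns.

```python
import xml.sax


class BlockHandler(xml.sax.ContentHandler):
    """Hypothetical handler: turns SAX events into [name, attributes, children] lists."""

    def __init__(self):
        super().__init__()
        # A synthetic root node so the document element has a parent to attach to.
        self.root = ["document", None, []]
        self.stack = [self.root]

    def startElement(self, name, attrs):
        # Attributes become a dict, or None when the element has none.
        node = [name, dict(attrs.items()) or None, []]
        self.stack[-1][2].append(node)
        self.stack.append(node)

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Entity references and CDATA sections arrive here already expanded.
        # The parser may deliver text in several chunks, so merge adjacent ones.
        children = self.stack[-1][2]
        if children and isinstance(children[-1], str):
            children[-1] += content
        else:
            children.append(content)


def to_blocks(xml_text):
    """Parse an XML string and return the nested-list form of the document."""
    handler = BlockHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root


print(to_blocks('<a x="1">R&amp;B <![CDATA[<raw>]]><b/></a>'))
# → ['document', None, [['a', {'x': '1'}, ['R&B <raw>', ['b', None, []]]]]]
```

The point of the callback design is that the parser itself never decides on an output shape. A different handler could just as well stream events to a file or count elements without building a tree.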
The next major chunk of forthcoming functionality is XML Namespaces
processing. The namespace handling is 80% there, and there is a switch for
turning on/off namespace processing during parsing. As someone who builds
commercial XML products (shhh...in C++), I know that namespaces are often
vital to processing real-world XML documents such as XHTML, BizTalk, SOAP,
ebXML, etc.
I hope to have namespace functionality completely done by the end of the
weekend.
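As a rough picture of what a namespace on/off switch changes, the sketch below parses the same document twice with Python's standard xml.sax module, once with namespace processing disabled and once enabled. This is not the script's REBOL interface. The helper names NameCollector and element_names and the urn:example:* URIs are made up for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Hypothetical handler: records element names as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: name is the raw qualified name.
        self.names.append(name)

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: name is a (uri, localname) pair.
        self.names.append(name)


def element_names(xml_text, namespaces):
    """Return the element names seen, with namespace processing on or off."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespaces)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.names


DOC = '<doc xmlns="urn:example:a" xmlns:b="urn:example:b"><b:item/></doc>'

print(element_names(DOC, namespaces=False))
# → ['doc', 'b:item']
print(element_names(DOC, namespaces=True))
# → [('urn:example:a', 'doc'), ('urn:example:b', 'item')]
```

With the switch off, a prefix such as b: is just part of the name. With it on, the prefix is resolved to its URI, so two documents that pick different prefixes for the same namespace compare equal.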
I built this for my own needs...and it made for a trial-by-fire experience
for learning to use REBOL's (wonderful) parse mechanism. I basically started
with the BNF production rules in the XML 1.0 spec and the XML Namespaces
spec.
I've got some more XML processing scripts that I'm working on polishing up.
Any comments, criticisms, suggestions are welcome.
The script has a lengthy Purpose: section in it -- no substitute for nice
HTML documentation, but for now it's the best I can do.
You can get the script at
http://www3.sympatico.ca/gavin.mckenzie/xml-parse.r
Gavin.