Mailing List Archive: 49091 messages

[REBOL] Re: What's the 'none' for in the parse-xml result?

From: gavin:mckenzie:sympatico:ca at: 11-Jul-2001 9:42

On July 11, 2001 3:20 AM Joel Neely wrote:
>Just take your example a little further.
>[snip]
>The block REBOL produces for an XML element contains the
>element name, attribute list, and content, in that order.
>[snip]
>An element that has no attributes has NONE for its second
>part, just as an element that has no content has NONE for
>its third part. Each item in the content block (if there
>is one) will either be a string or a block (of similar
>structure) for a subordinate element.
Yes...I did know this, and I've enjoyed your previous submissions on helper functions for accessing the sub-structures of a parsed-xml block.
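For readers outside REBOL, here is a minimal Python sketch of the three-part structure Joel describes. It is only a model: `to_block` is a hypothetical helper, Python's `None` stands in for REBOL's NONE, and `xml.etree.ElementTree` does the actual parsing, so it says nothing about how parse-xml itself is implemented.

```python
import xml.etree.ElementTree as ET

def to_block(elem):
    """Model an element as [name, attributes, content]; empty parts become None."""
    # Attributes as a flat name/value list, or None when the element has none.
    attrs = [x for pair in elem.attrib.items() for x in pair] or None
    content = []
    if elem.text and elem.text.strip():
        content.append(elem.text)            # leading character data
    for child in elem:
        content.append(to_block(child))      # subordinate element, same structure
        if child.tail and child.tail.strip():
            content.append(child.tail)       # character data after the child
    return [elem.tag, attrs, content or None]

block = to_block(ET.fromstring('<a x="1"><b/>text</a>'))
print(block)
```

The empty `<b/>` element comes out with `None` in both its second and third positions, which mirrors the NONE placeholders discussed in this thread.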
>>[snip]
>Based on looking at the code for XML-LANGUAGE, my conclusion
>was that the block for the top-level document was simply
>another block that followed the above structure (to avoid
>fencepost issues).
You may be right. I may be reading too much into it. The reason I assumed it might be intentional is that the notion of a top-level 'document' structure that contains meta-information about the document (such as the DocumentType enclosing the prolog) is consistent with the W3C XML DOM. Check out the IDL at:

http://www.w3.org/TR/DOM-Level-2-Core/idl-definitions.html

In normal DOM-based XML processing I'm used to dealing with a "document" object that contains a handle to the "document element", i.e. the root element of the document. This is consistent with the block structure returned by parse-xml.
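To make the DOM comparison concrete, here is a small sketch using Python's `xml.dom.minidom` (chosen only because it follows the W3C DOM naming). The `document_block` literal is an assumption about the general shape of the top-level block as described in this thread, with Python's `None` standing in for REBOL's NONE; it is not captured parse-xml output.

```python
from xml.dom import minidom

doc = minidom.parseString('<!DOCTYPE a><a><b/></a>')

# The document node is distinct from the root element: it holds the
# document element plus prolog information such as the DocumentType.
root = doc.documentElement
doc_node_name = doc.nodeName
root_name = root.tagName
doctype_name = doc.doctype.name

# Assumed analogue in block form: a top-level block with the same three
# parts as any element, whose content holds the root element's block.
document_block = ["document", None, [["a", None, [["b", None, None]]]]]
root_block = document_block[2][0]
```

In both cases the root element is reached through the document-level structure rather than being the top-level value itself.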
>I wrote extensions to handle comments and CDATA a while back,
>and had thought about doing an article on XML in REBOL. (Are
>you interested in collaborating?) But I'm not sure what you
>have in mind for namespaces. Were you thinking of actually
>writing a validating parser?
Nooo...I wasn't going to go down the validation route; that's more than I need. It's just that without some support for entities and CDATA sections, it's hard to process real-world XML data. By real-world XML data, I mean XML data that someone else created, so you can't constrain how much of XML 1.0 is used.

Same thing for namespaces. If you have to deal with any sort of XML application that packages/envelopes the content (e.g. SOAP, BizTalk, most XML EDI applications), then invariably you end up in one or both of two common circumstances:

1. Your XML data is enclosed in an 'envelope' denoted by a namespace
2. Your XML data contains data belonging to a namespace foreign to your original data

Either of these circumstances requires the ability to filter/mask, or at least recognize, namespace information. My plan was to add namespace info into the block structure. I've also created a SAX-style callback interface for occasions when you want to process an XML document in a streaming manner rather than suck the whole document into memory.

Interested in collaborating? Heck...I'd be pleased. Though your REBOL expertise would outclass mine. I can offer XML expertise...XML (and its associated specs Namespaces/Schema/XSLT/DSig/etc.) is all I've been doing for four years.

I'll post my parse-xml replacement tonight for (critical) review. Basically I've pretty much just used the BNF production rules from the XML 1.0 spec.

Gavin.
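As an illustration of the envelope case and of SAX-style streaming mentioned above, here is a rough Python sketch using the standard `xml.sax` module. It is not the REBOL callback interface referred to in this message; `NamespaceFilter`, `payload_names`, and the `urn:example:*` namespace URIs are all made up for the example.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

class NamespaceFilter(ContentHandler):
    """Collect local names of elements in one namespace; ignore all others."""
    def __init__(self, wanted_uri):
        super().__init__()
        self.wanted_uri = wanted_uri
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name                  # name is a (namespace URI, local name) pair
        if uri == self.wanted_uri:
            self.names.append(local)

def payload_names(data, wanted_uri):
    """Stream through the document, keeping only elements in wanted_uri."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)   # enable namespace-aware callbacks
    handler = NamespaceFilter(wanted_uri)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.names

# Hypothetical envelope wrapping an order payload in a different namespace.
SAMPLE = (b'<env:Envelope xmlns:env="urn:example:envelope" xmlns="urn:example:order">'
          b'<env:Body><order><item/></order></env:Body></env:Envelope>')
names = payload_names(SAMPLE, "urn:example:order")
print(names)
```

The handler sees each element as it is parsed, so the envelope elements are skipped without ever building the whole document in memory.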