Mailing List Archive: 49091 messages

[REBOL] Re: ANN: xml-object.r , and...a question about REBOL's built-in parse-xm

From: joel::neely::fedex::com at: 5-Oct-2001 2:00

Hi, Gavin,

Gavin F. McKenzie wrote:
> I've noticed some limitations in xml-object. If you have
> element with an attribute and a subelement with the same
> name, bad things happen. This should really be considered
> poor form in XML..
>
Sorry, but I must emphatically disagree. This is equivalent to saying that recursive function calls are bad form. XML markup shows semantic structure, and it is entirely legitimate that such structure be recursive in nature.

One of the first serious applications I wrote in REBOL (in fact it was one of the main reasons I began using REBOL) was an XML-based web site generator which combines content from individual HTML files with an XML document that represents the structure of the site. It generates per-page "navigation bars" from the knowledge of where each page fits into the overall site, and generates the final pages by inserting content and navigation into templates. (Sorry for the long-winded background, but it is the reason for the example below.)

A simplified version of the site file has content such as this (with ellipses standing for details that are beside the current point):

    <site docroot="/opt/netscape/suitespot/docs/devgroup/"
          source="/export/home/sitedev/devgroup/" ... >
      <page title="Home" file="index.html" ... >
        <page title="Our Mission" file="mission.html" .../>
        <page title="Our People" file="people.html" .../>
        <page title="Visit Us" file="map.html" .../>
      </page>
      <page title="Projects" file="proj.html" ... >
        <page title="Widgets" file="pr.3094.html" .../>
        <page title="Frobs" file="pr.3128.html" .../>
        <page title="Cruft" file="pr.3312.html" ... >
          <page title="Biggie" file="pr.3467.html" ... >
            <page title="ROI" file="roi.3467.html" .../>
            <page title="Budget" file="bud.3467.html" .../>
          </page>
        </page>
        ...
    </site>

It is entirely reasonable to have some pages with sub-pages and others without. Pages are represented with PAGE elements whose location (nested within other PAGEs or not) in the XML document shows where they fit into the site structure.

Since none of the information about a page (attributes of the PAGE element) is dependent on where the page is in the site, the site can be re-structured simply by moving one or more PAGE elements to a new place in the tree and re-running the generator (usually a 15- to 30-second effort).

Although the "recursion" is indirect, standard HTML allows the nesting of tables and framesets. XHTML (essentially writing HTML with XML notation conventions) should allow these as well.

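To make the nested PAGE idea concrete, here is a minimal sketch (not the generator described above) that walks such a tree recursively and prints the page titles indented by nesting depth. The function name show-pages and the cut-down site-xml string are hypothetical, made up for illustration. The sketch assumes the element shape seen in the parse-xml output elsewhere in this thread: [name attributes content], where attributes and content may each be none.

```rebol
REBOL [Title: "Outline of nested PAGE elements (illustrative sketch)"]

; Hypothetical, cut-down site file. The "..." placeholders of the
; original example are dropped so that the text is well-formed XML.
site-xml: {<site>
  <page title="Home" file="index.html">
    <page title="Our Mission" file="mission.html"/>
  </page>
  <page title="Projects" file="proj.html">
    <page title="Cruft" file="pr.3312.html">
      <page title="Biggie" file="pr.3467.html"/>
    </page>
  </page>
</site>}

; Assumed element shape (as in the parse-xml output in this thread):
;   [name attributes content]
; attributes: none, or a flat block of name/value strings
; content:    none, or a block mixing strings and child elements
show-pages: func [element [block!] depth [integer!] /local attrs content] [
    attrs: second element
    content: third element
    ; Print the title of each PAGE element, indented by its depth.
    if all [block? attrs  "page" = first element] [
        print [head insert/dup copy "" "  " depth  select attrs "title"]
    ]
    ; Recurse into child elements, skipping whitespace/text strings.
    if block? content [
        foreach child content [
            if block? child [show-pages child depth + 1]
        ]
    ]
]

doc: parse-xml site-xml       ; expected shape: [document none [[...]]]
show-pages first third doc 0  ; start at the root SITE element
```

Because the same function handles a PAGE at any depth, moving a PAGE element elsewhere in the file changes only the indentation it is printed with, which is the point of keeping page attributes independent of position.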
> I could also improve my mixed-content processing somewhat...anyway, more
> work to do.
>
> Now...on to a question.
>
> ... when parse-xml encounters an XML declaration
>
>     <?xml version ...?>
>
> it calls a function...
>
>     check-version: func [version][print ["XML Version:" version]]
>
> ...which has the nasty side effect of printing out "XML Version"
> with the version number.
>
> This message, of course, messes up my carefully crafted HTML
> page that is produced from my REBOL server page.
>
Disabling that one function is easy. I've made other modifications to xml-parser for other purposes as well. Here's some sample XML ...
>> foo: {
{    <?xml version="2.5" ?>
{    <motor productID="375-2385">
{        <assembly productID="238-2356">
{            <assembly productID="795-5837"/>
{            <assembly productID="123-4567"/>
{        </assembly>
{        <assembly productID="987-6543">
{    </motor>
{    }
== {
    <?xml version="2.5" ?>
    <motor productID="375-2385">
        <assembly productID="238-2356">
            <assembly productID="795-5837"...

... which shows your problem when parsed.

>> parse-xml foo
XML Version: 2.5
== [document none [["motor" ["productID" "375-2385"] ["^/ " ["assembly" ["productID" "238-2356"] ["^/ " ["assembly" ["pro...

So, let's disable the offending function ...

>> xml-language: make xml-language [
[    check-version: func [version][]
[    ]

... and parse again.

>> parse-xml foo
== [document none [["motor" ["productID" "375-2385"] ["^/ " ["assembly" ["productID" "238-2356"] ["^/ " ["assembly" ["pro...

HTH!

-jn-

--
The end of all our exploring will be to arrive where we started
and know the place for the first time. -- T.S. Eliot
joel-dot-neely-FIX-PUNCTUATION-at-fedex-dot-com