[REBOL] Re: ANN: xml-object.r , and...a question about REBOL's built-in parse-xm
From: joel::neely::fedex::com at: 5-Oct-2001 2:00
Hi, Gavin,
Gavin F. McKenzie wrote:
> I've noticed some limitations in xml-object. If you have an
> element with an attribute and a subelement with the same
> name, bad things happen. This should really be considered
> poor form in XML..
>
Sorry, but I must emphatically disagree.
This is equivalent to saying that recursive function calls are
bad form. XML markup shows semantic structure, and it is
entirely legitimate that such structure be recursive in nature.
One of the first serious applications I wrote in REBOL (in fact
it was one of the main reasons I began using REBOL) was an XML-
based web site generator which combines content from individual
HTML files with an XML document that represents the structure
of the site. It generates per-page "navigation bars" from the
knowledge of where each page fits into the overall site, and
generates the final pages by inserting content and navigation
into templates. (Sorry for the long-winded background, but it
is the reason for the example below.)
A simplified version of the site file has content such as this
(with ellipses standing for details irrelevant to the current point):
<site docroot="/opt/netscape/suitespot/docs/devgroup/"
      source="/export/home/sitedev/devgroup/" ... >
  <page title="Home" file="index.html" ... >
    <page title="Our Mission" file="mission.html" .../>
    <page title="Our People" file="people.html" .../>
    <page title="Visit Us" file="map.html" .../>
  </page>
  <page title="Projects" file="proj.html" ... >
    <page title="Widgets" file="pr.3094.html" .../>
    <page title="Frobs" file="pr.3128.html" .../>
    <page title="Cruft" file="pr.3312.html" ... >
      <page title="Biggie" file="pr.3467.html" ... >
        <page title="ROI" file="roi.3467.html" .../>
        <page title="Budget" file="bud.3467.html" .../>
      </page>
    </page>
    ...
</site>
It is entirely reasonable to have some pages with sub-pages and
others without. Pages are represented with PAGE elements whose
location (nested within other PAGEs or not) in the XML document
shows where they fit into the site structure. Since none of the
information about a page (attributes of the PAGE element) is
dependent on where the page is in the site, the site can be
re-structured simply by moving one or more PAGE elements to a
new place in the tree and re-running the generator (usually a
15- to 30-second effort).
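To make the recursion concrete, here is a minimal sketch of walking such a structure. It is not the actual generator: the function name SHOW-TRAILS and the tiny sample document are invented for illustration, and it assumes the usual parse-xml result layout, in which each element is a block of the form [name attributes-or-none content-or-none].

```rebol
REBOL [Title: "Sketch: walk nested PAGE elements"]

; Hypothetical sample, far smaller than a real site file.
site-xml: {<site><page title="Home"><page title="Our Mission"/></page><page title="Projects"><page title="Widgets"/></page></site>}

; SHOW-TRAILS (a made-up name) prints, for every PAGE element,
; the chain of titles leading from the top of the site to that page.
show-trails: func [element [block!] trail [block!] /local attrs kids here line] [
    attrs: second element      ; attribute block, or none
    kids: third element        ; content block, or none
    here: copy trail           ; copy, so sibling pages do not see our title
    if all [(first element) = "page" block? attrs] [
        append here select attrs "title"
        line: copy ""
        foreach title here [
            if not empty? line [append line " > "]
            append line title
        ]
        print line
    ]
    if block? kids [
        foreach kid kids [
            ; content can also hold strings (text, whitespace); skip those
            if block? kid [show-trails kid here]
        ]
    ]
]

; The third item of parse-xml's result is a block holding the root element.
show-trails first third parse-xml site-xml copy []
```

The same function handles a PAGE at any depth, which is the point: a PAGE nested inside a PAGE needs no special treatment, and a trail such as "Home > Our Mission" falls out of the nesting alone.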
Although the "recursion" is indirect, standard HTML allows the
nesting of tables and framesets. XHTML (essentially writing
HTML with XML notation conventions) should allow these as well.
> I could also improve my mixed-content processing somewhat...anyway, more
> work to do.
>
> Now...on to a question.
>
> ... when parse-xml encounters an XML declaration
>
> <?xml version ...?>
>
> it calls a function...
>
> check-version: func [version][print ["XML Version:" version]]
>
> ...which has the nasty side effect of printing out "XML Version"
> with the version number.
>
> This message, of course, messes up my carefully crafted HTML
> page that is produced from my REBOL server page.
>
Disabling that one function is easy. I've made other modifications
to xml-parser for other purposes as well.
Here's some sample XML ...
>> foo: {
{ <?xml version="2.5" ?>
{ <motor productID="375-2385">
{ <assembly productID="238-2356">
{ <assembly productID="795-5837"/>
{ <assembly productID="123-4567"/>
{ </assembly>
{ <assembly productID="987-6543"/>
{ </motor>
{ }
== {
<?xml version="2.5" ?>
<motor productID="375-2385">
<assembly productID="238-2356">
<assembly productID="795-5837"...
... which shows your problem when parsed.
>> parse-xml foo
XML Version: 2.5
== [document none [["motor" ["productID" "375-2385"] ["^/ "
["assembly" ["productID" "238-2356"] ["^/ "
["assembly" ["pro...
So, let's disable the offending function ...
>> xml-language: make xml-language [
[ check-version: func [version][]
[ ]
... and parse again.
>> parse-xml foo
== [document none [["motor" ["productID" "375-2385"] ["^/ "
["assembly" ["productID" "238-2356"] ["^/ "
["assembly" ["pro...
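Since the console truncates the result, a short sketch of picking it apart may help. It assumes only the layout visible in the output above (a block whose third item holds the root element, itself a block of name, attributes, and content); the words RESULT and ROOT are just names chosen for this example.

```rebol
result: parse-xml foo
root: first third result              ; the MOTOR element block
print first root                      ; element name
print select second root "productID"  ; value of the productID attribute
```

Given the result shown above, these two lines should print "motor" and "375-2385". Each nested ASSEMBLY element is a block of the same shape inside the content block (third root).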
HTH!
-jn-