SGML/XML support?
[1/2] from: dm128::microconnect::net at: 3-Jul-2001 19:53
I'm new to this, so excuse me if this is documented elsewhere; I was unable
to find it.
<I'm new to the whole thing and am working on a chat client, so again, excuse
me if my question is poorly phrased.>
Does REBOL support opening up an SGML/XML or data stream? If so, how?
Thank you,
David
[2/2] from: joel:neely:fedex at: 3-Jul-2001 16:55
Hi, David,
David wrote:
> Does rebol support opening up an SGML/XML or data stream,
> if so, how?
>
Pardon me if I misunderstand the question, but you can
apply the PARSE-XML function to data from any source. The
argument to PARSE-XML is a string and the result is a block
structure described further below. For example:
foo: parse-xml read %data.xml
or
foo: parse-xml read http://slashdot.org/slashdot.rdf
The result will be a block structure of the form
[document none [
        ;content appears here
    ]
]
Each component of the content block is either an XML element
(another block) or a string. XML elements are represented by
three-item blocks where
* the first item is the name of the XML element (as a string),
* the second item is a block of name/value pairs for the
attributes in the element, or NONE if no attributes appeared,
* the third item is a block holding the contents of the XML
element (nested strings or other elements), or NONE for an
empty element.
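The three-item layout just described can be wrapped in a few small
accessor functions. This is only a minimal sketch: the names XML-NAME,
XML-ATTR and XML-CHILDREN are hypothetical helpers, not part of REBOL,
and they assume an element block shaped exactly as listed above.

```rebol
; Hypothetical accessors for a PARSE-XML element block:
; [name attributes content]

xml-name: func [xel [block!]] [first xel]

; Value of the named attribute, or NONE if the element has no
; attributes or no attribute of that name.
xml-attr: func [xel [block!] name [string!] /local attrs] [
    attrs: second xel
    if block? attrs [
        foreach [n v] attrs [
            ; == is case-sensitive, as XML names are
            if n == name [return v]
        ]
    ]
    none
]

; Child elements only, skipping any text strings.
xml-children: func [xel [block!] /local result] [
    result: copy []
    if block? third xel [
        foreach item third xel [
            if block? item [append/only result item]
        ]
    ]
    result
]

; Usage with a hand-written element block
height: ["height" ["units" "cm"] ["243"]]
print xml-name height            ; prints height
print xml-attr height "units"    ; prints cm
probe xml-children height        ; prints [] (the only content is a string)
```

The attribute lookup walks the name/value pairs two at a time rather than
searching the flat block, so an attribute value that happens to equal
another attribute's name cannot be mistaken for it.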
For example:
>> my-x: {
{ <demo>
{ <height units="cm">243</height>
{ <case><veneer type="walnut" finish="glossy"/></case>
{ </demo>
{ }
== {
<demo>
<height units="cm">243</height>
<case><veneer type="walnut" finish="glossy"/></case>
</demo>
}
>> my-blk: parse-xml my-x
== [document none [["demo" none ["^/ " ["height" ["units"
"cm"] ["243"]] "^/ " ["case" none [["veneer" ["type"
"walnut" "finish" ...
>> my-blk/1
== document
>> my-blk/2
== none
>> my-blk/3
== [["demo" none ["^/ " ["height" ["units" "cm"] ["243"]]
"^/ " ["case" none [["veneer" ["type" "walnut" "finish"
"glossy"] none]...
>> my-blk/3/1
== ["demo" none ["^/ " ["height" ["units" "cm"] ["243"]]
"^/ " ["case" none [["veneer" ["type" "walnut" "finish"
"glossy"] none]]...
>> my-blk/3/1/3
== ["^/ " ["height" ["units" "cm"] ["243"]] "^/ " ["case"
none [["veneer" ["type" "walnut" "finish" "glossy"] none]]]
"^/"]
>> foreach item my-blk/3/1/3 [print mold item]
"^/ "
["height" ["units" "cm"] ["243"]]
"^/ "
["case" none [["veneer" ["type" "walnut" "finish" "glossy"]
none]]]
"^/"
The following function can "walk" through this structure,
testing each XML element with a selection function and calling
a processing function on each element that passes.
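walkxml: func [
    xmlb [block!]           ; result of PARSE-XML
    sel [any-function!]     ; selector, called with the stack of open elements
    doer [any-function!]    ; processor, called when the selector passes
    /local _walk parents
][
    parents: copy []        ; current element first, then its ancestors
    _walk: func [xel [block!]] [
        insert/only parents xel
        if do [sel parents] [do [doer parents]]
        if found? third xel [
            foreach item third xel [
                if block? item [_walk item]
            ]
        ]
        remove parents
    ]
    _walk first third xmlb  ; start at the root element
    exit
]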
Using a simple "pass everything" selector and an "indent the
element names" processing function,
always: func [elems [block!]] [true]
indent-element-name: func [elems [block!]] [
    print [
        head insert/dup copy "" " " length? elems
        first first elems
    ]
]
we can see the structure of the headline data available from
Slashdot:
>> slash-xml: parse-xml read http://slashdot.org/slashdot.rdf
>> walkxml slash-xml :always :indent-element-name
  rdf:RDF
   channel
    title
    link
    description
   image
    title
    url
    link
   item
    title
    link
   item
    title
    link
...etc...
Now that we know the structure, we can get all of the titles
by filtering for "title" tags and printing the strings each
one contains.
titles-only: func [elems [block!]] [
    elems/1/1 = "title"
]
print-title: func [elems [block!]] [
    print elems/1/3/1
]
>> walkxml slash-xml :titles-only :print-title
Slashdot: News for nerds, stuff that matters
Slashdot
Killustrator author required to pay two grand
Reverse Engineering .NET - Good, Bad or Inevitable?
Quantum Mechanics Symposium
The Great Computer Language Shootout
Pine/Pico License Misconceptions
Embracing Digital Photography
The Poverty Of Attention
From Serf to Surfer: Becoming a Network Consultant
MSNBC on Slashdot
Bionic Human: 1st Fully Implanted Human Heart
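Because the selector receives the whole stack of open elements (the
current element first, its parent second), it can also test an
element's ancestors. As a rough sketch, the selector below passes only
"title" elements whose immediate parent is an "item"; the name
ITEM-TITLES-ONLY is hypothetical, and the code assumes the WALKXML,
PRINT-TITLE and SLASH-XML definitions given earlier.

```rebol
; Hypothetical selector: true only for <title> directly inside <item>.
; ELEMS is the stack built by WALKXML: elems/1 is the current element,
; elems/2 its parent (NONE when the current element is the root).
item-titles-only: func [elems [block!]] [
    all [
        elems/1/1 = "title"
        block? elems/2          ; guard: the root has no parent
        elems/2/1 = "item"
    ]
]

walkxml slash-xml :item-titles-only :print-title
```

Given the structure listed earlier, this should skip the channel and
image titles and print only the headlines.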
HTH!
-jn-
--
------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com