object2XML
[1/16] from: tbrownell:veleng at: 5-Mar-2004 14:34
Anyone have a simple example using Gavin's xml-object.r and xml-parse.r
that shows...
1. Reading an XML document.
2. Converting to an object for manipulation within rebol.
3. Converting that object back into XML and rewriting it back to the file
<person><name>Bob</name><email>[bob--spam--com]</email></person>
How would one change the email and save it back?
Thanks,
Terry
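A minimal sketch of the round trip asked about above. It is not from the thread and does not use xml-object.r or xml-parse.r, whose API is not shown here; it uses Python's standard xml.etree.ElementTree module purely to illustrate the three steps (read, change, write back). The function name change_email and the replacement address are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample document from the question (address kept in its obfuscated form).
SOURCE = "<person><name>Bob</name><email>[bob--spam--com]</email></person>"

def change_email(xml_text, new_email):
    """Parse the XML, replace the text of <email>, and serialize it again."""
    root = ET.fromstring(xml_text)          # step 1: read
    email = root.find("email")
    if email is None:
        raise ValueError("no <email> element found")
    email.text = new_email                  # step 2: manipulate
    return ET.tostring(root, encoding="unicode")  # step 3: back to XML

# Hypothetical replacement address, for illustration only.
updated = change_email(SOURCE, "bob@example.org")
print(updated)
```

To save the result to a file instead of a string, the same tree could be written with ET.ElementTree(root).write(path).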
[2/16] from: jason:cunliffe:verizon at: 5-Mar-2004 17:51
Hi Terry
This Rebol Jabber client might be useful for you to study
http://www.rebolfrance.net/projets/concours/maoww.zip
- Jason
[3/16] from: robert:muench:robertmuench at: 6-Mar-2004 14:02
On Fri, 05 Mar 2004 14:34:03 -0800, Terry Brownell <[tbrownell--veleng--com]>
wrote:
> Anyone have a simple example using Gavin's xml-object.r and xml-parse.r
> that shows...
<<quoted lines omitted: 4>>
> <person><name>Bob</name><email>[bob--spam--com]</email></person>
> How would one change the email and save it back?
Hi, what's the big problem with it? You parse the XML into a stream of
tags and content. Then build a block with the tag-words used as set-words of
an object, plus the content (a plain append should do the job), and then use
this block as the prototype for a new object.
But you should think about whether to use an object at all. I would use a block of
name/value pairs. Much easier to handle. Converting such a block back to
XML is easy. Yes, I know I shouldn't only talk about how to do it but do
it... but there are other things on my to-do list. Robert
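The name/value approach described above can be sketched roughly as follows. This is an illustration in Python, not Robert's code: a flat list of alternating names and values stands in for a REBOL block, the helper names xml_to_pairs and pairs_to_xml are invented, and only a flat record (no nesting, no attributes) is handled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def xml_to_pairs(xml_text):
    """Flatten the child elements of the root into [name, value, ...]."""
    root = ET.fromstring(xml_text)
    pairs = []
    for child in root:
        pairs.extend([child.tag, child.text or ""])
    return root.tag, pairs

def pairs_to_xml(root_name, pairs):
    """Rebuild a flat XML record from alternating names and values."""
    parts = ["<%s>" % root_name]
    for name, value in zip(pairs[0::2], pairs[1::2]):
        parts.append("<%s>%s</%s>" % (name, escape(value), name))
    parts.append("</%s>" % root_name)
    return "".join(parts)

root_name, pairs = xml_to_pairs(
    "<person><name>Bob</name><email>old</email></person>")
# Change the value that follows the name "email" (assumes no value equals "email").
pairs[pairs.index("email") + 1] = "new"
result = pairs_to_xml(root_name, pairs)
print(result)
```

The trade-off is the one hinted at in the message: a flat list is trivial to edit and serialize, but it drops attributes and nesting, so it only suits simple record-like documents.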
[4/16] from: tbrownell:veleng at: 6-Mar-2004 22:09
It may not be a big deal to a guru, but I've spent a fair bit of time
going over everything I can find on Rebol and XML, and frankly, the
tools just aren't there. Or if they are, they're not well documented if
they are documented at all. I can easily convert nested blocks into xml
via xmlgen.r, but I can't see any way, let alone an easy documented way, to
a) read an XML doc
b) manipulate it in Rebol (i.e. poke)
c) convert it back to XML
I've never really needed to do this with Rebol before, so never gave it
much thought. But it strikes me that, given the continuing interest in
XML, Rebol will need to make a strong showing.
Robert M. Münch wrote:
[5/16] from: gchiu:compkarori at: 7-Mar-2004 9:10
T. Brownell wrote.. apparently on 6-Mar-2004/22:09:36-8:00
>tools just aren't there. Or if they are, they're not well documented if
>they are documented at all. I can easily convert nested blocks into xml
Gavin's documentation ( found by using RebolML's search function :) )
http://web.archive.org/web/20020210063622/www3.sympatico.ca/gavin.mckenzie/rebol/xml-object-info.html
--
Graham Chiu
http://www.compkarori.com/cerebrus
http://www.compkarori.com/rebolml
[6/16] from: inetw3:mindspring at: 8-Mar-2004 18:24
Hello T.B.,
Try my quickparser.r script at Rebol.org
Search for *quickparser*
You're right about Rebol and XML. I believe sooner or later
Carl will either add an XML parser as a DLL or functions to
the REBOL exe. The closer they get to an IE plug-in, the
more they will see Rebol really needs to deal with XHTML,
XML, HTML, etc. whether they like it or not, or people at large will
not use it as a first-class quick-fix scripting language.
Which will kill Rebol as a solution; it will be fragmented(IE),
and made into a nice but impractical toy.
[7/16] from: atruter:labyrinth:au at: 9-Mar-2004 12:36
> whether they like it or not, or people at large will
> not use it as a first-class quick-fix scripting language.
This may be true for *some* people in *some* problem domains. Given the
variety of jobs / tasks REBOL is put to, I always have to laugh at the
various pronouncements that, "REBOL will die if it doesn't have [insert
feature / buzzword here]".
If it's not the right tool for the job, use a different tool (or redefine
the job).
> Which will kill Rebol as a solution; it will be fragmented(IE),
> and made into a nice but impractical toy.
Same comment as above.
Regards,
Ashley
[8/16] from: inetw3:mindspring at: 8-Mar-2004 22:38
Hey Ashley,
It is funny that people get worked up over
whether or not their wishlist stuff gets
implemented in a new or developing language,
and if it doesn't, they hope that language dies.
What did I mean by *fragmented(IE).........
>>If it's not the right tool for the job, use a different tool
My point exactly. We should want to be able to use Rebol.
(I know Carl Does)
I know Rebol will never die, it will just transform.
If it transforms because of an IE plug-in, and it has,
we will work hard at creating a way to deal with
Internet Explorer from a Rebol point of view.
We or RT can create this non-GUI API, or everyone in
the spirit of Rebol can roll their own.
#Roll your own = fragmented(IE) scripting support. My script
can not easily work with your script. (Hey I'll just extend
parts of your script and call it my script! Now I'll sell my script!)
Rebol/Plug-in.exe Api = yet another Rebol exe to add to the
Rebol/whatevers. And to have it work for Mac,Linux,Opera,
Mozilla, etc. we must build more Rebol/whatevers or M$
programming will dominate how we deal with the In-the-Web
programming.
This is not a negative post. It just shows that no matter how you
spin it, whatever route you go with the browsers, XML plugs have
to be built into the Rebol API or it won't be practical to use.
Unless the plug-in is only there to launch Reblets from the browser,
in which case RT has done their job very well.
[9/16] from: bry:itnisk at: 9-Mar-2004 11:31
Well, it seems to me that the best solution
to that is support for LibXML, LibXSLT
http://www.xmlsoft.org/
etc.
If everything was supported, Rebol's XML
problems would be over.
> ----- Original Message -----
<<quoted lines omitted: 4>>
> >
> > XML can be both simple and complex. This site is devoted to the creation
> > of the necessary tools and docs to make the complex simple too using
> > Rebol.
> >
> > http://compkarori.com/vanilla/display/Simple+XML
[10/16] from: robert:muench:robertmuench at: 9-Mar-2004 18:46
On Mon, 8 Mar 2004 18:24:11 -0600, iNetW3 <[inetw3--mindspring--com]> wrote:
> You're right about Rebol and XML. I believe sooner or later
> Carl will either add an XML parser as a DLL or functions to
> the REBOL exe. The closer they get to an IE plug-in, the
> more they will see Rebol really needs to deal with XHTML,
> XML, HTML, etc. whether they like it or not, or people at large will
> not use it as a first-class quick-fix scripting language.
Can you give a use case for which this is required? Every once in a while this
XML thing shows up again. If the demand is that high, why hasn't
anyone started to write one? When I use Rebol I only see a use for XML for
import or export. This can be done quite well with the on-board tools. Why
should I use XML stuff?
Think about why no full-blown XML parser exists for Rebol... this is a
much more interesting question ;-) Robert
[11/16] from: maximo:meteorstudios at: 9-Mar-2004 13:40
Hi Robert,
IMHO for one thing, it is like the "regular expression" (RE) topic which crops up now
and then.
People need an easy migration path. Anyone who has been convinced that xml is the end
of the world in ascii data sharing, will be more easily lured if that is more completely
supported.
for myself, I have found xml is nice to support at least on import because many open
source and on-line tools use XML as an export format. Things like namespaces, I have
read, will obfuscate rebol's xml engine... maybe we are simply lacking in the more advanced
features which are getting more common than they used to be...
I really am not an xml genius, I just noticed that it is a trendy and competent ascii
format, maybe if the rest of the world is using it... we should at least support it conveniently
so that the phrase:
REBOL is the glue that binds things together
would hold more meaning with regards to other tools which already expect their data to
be bound to other tools...
maybe the fact that no one has done a complete xml port is that no guru has spent the
better part of a month or two to do it.
My guess is that many would benefit, but not that many are able to actually do
it... so they eventually turn to another solution..
in the near future, I will have to parse several MBs of textual xml data and I will see
at that point how rebol handles it. until then, I'm just an interested reader...
cheers! :-)
-MAx
---
You can either be part of the problem or part of the solution, but in the end, being
part of the problem is much more fun.
[12/16] from: bry:itnisk at: 9-Mar-2004 21:52
Well I'm reasonably knowledgeable in matters
of xml usage, etc. although my rebol
knowledge is shit, given that I just use it
for small scripting hacks here and there
if I use gavin's xml-object and a function
to clean-up the output a bit:
doc-tree: func[unpickedDom][pick third unpickedDom 1]
I get
>> xmldom: parse-xml read %t.xml
XML Version: 1.0
== [document none [["tag"
["r" "here" "xmlns:stuff" "http://www.x.com"]
["^/stuff here ^/" ["stuff:p" ["hi" "test"]
[["blah" none [...
>> t: doc-tree xmldom
== ["tag"
["r" "here" "xmlns:stuff" "http://www.x.com"]
["^/stuff here ^/" ["stuff:p" ["hi" "test"]
[["blah" none ["more"]]]]
^/
...
t is actually
["tag"
["r" "here" "xmlns:stuff" "http://www.x.com"]
["^/stuff here ^/" ["stuff:p" ["hi" "test"]
[["blah" none ["more"]]]] "^/" ["blah" none none] "^/"]]
now in this case I don't think the
namespaces are a problem, I don't understand
xml-object well enough to know if it fails
on namespace problems, but a namespace
function could be built easily enough to go
through the block getting all referenced
namespaces and checking against those
references whenever a usage is encountered.
Of course it should be noted that the
namespaces are placed in a block with the
attributes but I don't think that is a major
problem although there should of course be
functions for returning just attributes
without namespaces.
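A helper of the kind mentioned above, returning attributes without the namespace declarations, might look like the following sketch. It is written in Python over a flat list that mimics the attribute block printed earlier (["r" "here" "xmlns:stuff" "http://www.x.com"]); the name split_attributes is invented, and treating a bare xmlns as the default namespace follows the Namespaces in XML convention rather than anything in xml-object.r.

```python
def split_attributes(attr_block):
    """Separate xmlns declarations from ordinary attributes.

    attr_block is a flat list of alternating names and values, or None
    for an element without attributes.
    """
    attributes, namespaces = {}, {}
    flat = attr_block or []
    for name, value in zip(flat[0::2], flat[1::2]):
        if name == "xmlns":
            namespaces[""] = value                     # default namespace
        elif name.startswith("xmlns:"):
            namespaces[name[len("xmlns:"):]] = value   # prefixed declaration
        else:
            attributes[name] = value
    return attributes, namespaces

attrs, ns = split_attributes(["r", "here", "xmlns:stuff", "http://www.x.com"])
print(attrs)  # {'r': 'here'}
print(ns)     # {'stuff': 'http://www.x.com'}
```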
What I find more irritating is the textnodes:
I have 4 textnodes:
^/stuff here ^/
["more"]
^/
and again
^/
now none is used in an empty tag, but "^/" is used for any empty
textnode, and "^/string^/" seems to be used for any textnode that has a
sibling node, whereas textnodes that are only children are represented as
a block with one string value. it would probably be better to just do
that as another ["^/string value^/"]
One of the things that should probably be
considered for any functions for working
with xml in rebol is optimizations for
working with various types of xml, for
example a document like structure such as we
see above (for which I would say the rule is
that a document structure has multiple
textnodes, that an element which has as a
direct child a textnode and an element is a
document structure) as opposed to the more
programmer friendly data type structure:
<customers>
<customer>
<name><fname>John</fname>
<lname>Simpson</lname>
</name>
.....
</customer>
</customers>
[13/16] from: rebol:gavinmckenzie:fastmail:fm at: 9-Mar-2004 18:00
Comments below...
On Tue, 9 Mar 2004 21:52:59 CET, [bry--itnisk--com] said:
> Well I'm reasonably knowledgable in matters
> of xml usage, etc. although my rebol
<<quoted lines omitted: 6>>
> I get
> >> xmldom: parse-xml read %t.xml
You meant parse-xml+ right? parse-xml is the REBOL built-in parser.
>[snip]
> now in this case I don't think the
<<quoted lines omitted: 5>>
> namespaces and checking against those
> references whenever a usage is encountered.
What do you wish to do with namespaces? They aren't at all as
straightforward as they seem. They get inherited, and the namespace
prefixes can be reused within the nesting of the document all the while
resolving to totally different namespace URIs. The namespace processing,
if it has a chance, should be put into the parser itself and not in
xml-to-object or in some higher level processing. Adding in a namespace
aware SAX2-style handler into parse-xml is IMHO the only workable way to
go.
> Of course it should be noted that the
> namespaces are placed in a block with the
> attributes but I don't think that is a major
> problem although there should of course be
> functions for returning just attributes
> without namespaces.
>
But that's the thing: namespace declarations *look* like attributes, but
they really aren't as far as XML is concerned. They need to be treated
specially.
> What I find more irritating is the textnodes:
> I have 4 textnodes:
<<quoted lines omitted: 11>>
> probably be better to just do that as
> another ["^/string value^/"]
It depends what you want. Dropping any whitespace is a decision that can
only be made by the processing application and not the parser. The
parse-xml+ code has a set of default handlers, but you could choose to
implement your own. xml-to-object is intended to work with "data" styles
of XML and hence whitespace is more easily discarded in such XML without
too much risk.
> One of the things that should probably be
> considered for any functions for working
<<quoted lines omitted: 15>>
> </customer>
> </customers>
Agreed, the decisions about "optimizing" have to be done in light of the
type of XML you're processing and at a processing level above the parser,
not down in the parser itself.
[14/16] from: bry:itnisk at: 10-Mar-2004 11:06
> What do you wish to do with namespaces? They aren't at all as
> straightforward as they seem. They get inherited, and the namespace
> prefixes can be reused within the nesting of the document all the while
> resolving to totally different namespace URIs. The namespace processing,
> if it has a chance, should be put into the parser itself and not in
> xml-to-object or in some higher level processing. Adding in a namespace
> aware SAX2-style handler into parse-xml is IMHO the only workable way to
> go.
I'm talking about having a library of functions that call parse-xml and
then do the namespace conformance checking. Why would this be a good idea?
1. XML version 1.0 does not have any connection to the namespace
specification. There is the following note in the current version of the
spec: "The Namespaces in XML Recommendation [XML Names] assigns a meaning
to names containing colon characters. Therefore, authors should not use
the colon in XML names except for namespace purposes, but XML processors
must accept the colon as a name character." Most processors do not accept
the colon as a name character without a namespace declaration, but as can
be seen from the text above that is incorrect. Therefore one can in fact
have XML documents that have elements called blah:text and have those
documents be well-formed, although of course that is not industry
standard practice (but if you examine the SVG put out by Illustrator,
Photoshop etc. you will notice that when an xlink: namespace prefix is
used there is no xlink namespace declaration in the document; this of
course violates the XLink spec but not the XML spec).
Because of this it might be preferable to layer the namespace handling in
such a way that one can build stricter levels of specification
conformance.
> > Of course it should be noted that the
> > namespaces are placed in a block with the
> > attributes but I don't think that is a major
> > problem although there should of course be
> > functions for returning just attributes
> > without namespaces.
> >
>
> But that's the thing: namespace declarations *look* like attributes, but
> they really aren't as far as XML is concerned. They need to be treated
> specially.
>
Hence my making a differentiation between them in my post. Again, to a
straight conformant XML 1.0 processor, the fact that an attribute is
called xmlns:hi means absolutely nothing. To a processor that understands
both namespaces and XML 1.0 it does mean something. Therefore, again, I
suppose that it may be useful to keep namespace handling as functions
separate from parse-xml.
> > What I find more irritating is the textnodes:
> >
> > I have 4 textnodes:
<<quoted lines omitted: 6>>
> >
> > now none is used in an empty tag, but "^/"
> > is used for any empty textnode, and "^/string^/"
> > seems to be used for any textnode
> > that has a sibling node, whereas textnodes
> > that are only children are represented as a
> > block with one string value. it would
> > probably be better to just do that as
> > another ["^/string value^/"]
> >
>
> It depends what you want. Dropping any whitespace is a decision that can
> only be made by the processing application and not the parser. The
> parse-xml+ code has a set of default handlers, but you could choose to
> implement your own. xml-to-object is intended to work with "data" styles
> of XML and hence whitespace is more easily discarded in such XML without
> too much risk.
Again, that was not what I was complaining about. I found the difference
in how a textnode was represented disconcerting for the usage of a more
strict parser built on top of parse-xml. It seems to me that
"^/string value here" is a reasonable way to signify that a node is a
textnode, since an element name can't start with a ^ and one would just
not check to see if a node were a textnode or element inside of an
attribute block.
NOTE: again, this is discussing the possibility of a generic XML
processing library of functions on top of parse-xml, so that you could
have a strip-empty-text func that takes a rebolxmldom parameter and
returns the rebolxmldom at the end with all empty textnodes stripped out.
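The strip-empty-text idea could be sketched along these lines. This is a Python illustration, not an existing REBOL function: nested lists of the form [name, attrs, children] stand in for the block structure printed earlier in the thread, text nodes are assumed to be plain strings, and None plays the role of REBOL's none.

```python
def strip_empty_text(node):
    """Return a copy of an element [name, attrs, children] without
    whitespace-only text nodes. Text nodes are plain strings."""
    name, attrs, children = node
    if children is None:                 # empty tag, nothing to strip
        return [name, attrs, None]
    kept = []
    for child in children:
        if isinstance(child, str):
            if child.strip():            # keep text with visible content
                kept.append(child)
        else:
            kept.append(strip_empty_text(child))
    return [name, attrs, kept]

# Assumed transcription of the sample tree, with "^/" written as "\n".
tree = ["tag", ["r", "here", "xmlns:stuff", "http://www.x.com"], [
    "\nstuff here \n",
    ["stuff:p", ["hi", "test"], [["blah", None, ["more"]]]],
    "\n",
    ["blah", None, None],
    "\n",
]]
cleaned = strip_empty_text(tree)
print(cleaned[2])
```

Because the function returns a new structure and leaves the input alone, the decision to drop whitespace stays with the calling application, which is the division of labour both posters seem to agree on.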
> >
> > One of the things that should probably be
> > considered for any functions for working
> > with xml in rebol is optimizations for
> > working with various types of xml, for
> > example a document like structure such as we
> > see above (for which I would say the rule is
> > that a document structure has multiple
> > textnodes, that an element which has as a
> > direct child a textnode and an element is a
> > document structure) as opposed to the more
> > programmer friendly data type structure:
> >
>
> Agreed, the decisions about "optimizing" have to be done in light of the
> type of XML you're processing and at a processing level above the parser,
> not down in the parser itself.
I'm suggesting that rather than having parse-xml as the first and final
way to read a document, one should have a library built around parse-xml.
So I'm not saying that parse-xml should be fixed; I've come to the
conclusion that it is reasonably okay as a starting point. Why is it
reasonably okay? Because frankly there is a lot of non-conformant XML out
there that is, in usage, accepted by different applications and
processors. I would as a general rule be against working with such stuff
but, for example, msxml accepts elements named xml, and according to the
recommendation that name is reserved:
"[Definition: A Name is a token beginning with a letter or one of a few
punctuation characters, and continuing with letters, digits, hyphens,
underscores, colons, or full stops, together known as name characters.]
Names beginning with the string "xml", or with any string which would
match (('X'|'x') ('M'|'m') ('L'|'l')), are reserved for standardization
in this or future versions of this specification."
That of course wouldn't be so bad, but a lot of Microsoft markup comes
with elements named xml in it, and a lot of people using only msxml have
XML documents with element names like xml-metadata in them and suchlike.
Probably it would be a good thing if one could accept those documents.
[15/16] from: bry:itnisk at: 10-Mar-2004 11:19
> What do you wish to do with namespaces? They aren't at all as
> straightforward as they seem. They get inherited, and the namespace
> prefixes can be reused within the nesting of the document all the while
> resolving to totally different namespace URIs.
I believe this is what Joe English called
psychotic namespacing.
i've seen a lot of fucked up xml in my time,
it's actually quite rare to see:
<tag xmlns="http://tag.com"
xmlns:t="http://tag.com">
<t:tag>
<p>hi</p>
<t:tag xmlns:t="http://nottag.com">
<t:p>hi</t:p>
</t:tag>
</t:tag>
</tag>
but it is of course possible, my perspective
is to penalize that kind of structure, to
optimize for more common structures and hell
if it turns out in the middle of analyzing a
structure that it is this kind of mess, to
restart, so it takes longer, too bad.
Also, from a namespace point of view the prefix is absolutely
meaningless, so one could in fact process the above with a namespace
function: if it encounters a prefix that it has seen before but that is
now bound to a different namespace, all it has to do is autogenerate a
prefix, change the value in the block to that prefix, and move on.
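The point that prefixes carry no meaning of their own can be illustrated with the sample document above. The sketch below is in Python and leans on xml.etree.ElementTree, which resolves every element name to its namespace URI while parsing; that is closer to doing the work inside the parser than to a separate function layered over parse-xml output, so it shows the idea rather than the proposed design. The function name unique_prefixes and the generated prefixes ns0, ns1 are invented for the example.

```python
import xml.etree.ElementTree as ET

# The "psychotic namespacing" sample from the message above.
DOC = """<tag xmlns="http://tag.com" xmlns:t="http://tag.com">
<t:tag>
<p>hi</p>
<t:tag xmlns:t="http://nottag.com">
<t:p>hi</t:p>
</t:tag>
</t:tag>
</tag>"""

def unique_prefixes(xml_text):
    """Map each namespace URI to one generated prefix and list the
    elements in document order as (prefix, local-name) pairs."""
    prefixes = {}
    names = []
    for elem in ET.fromstring(xml_text).iter():
        if elem.tag.startswith("{"):
            # ElementTree expands names to "{uri}local".
            uri, local = elem.tag[1:].split("}", 1)
        else:
            uri, local = "", elem.tag    # element in no namespace
        prefix = prefixes.setdefault(uri, "ns%d" % len(prefixes))
        names.append((prefix, local))
    return prefixes, names

prefixes, names = unique_prefixes(DOC)
print(prefixes)
print(names)
```

After this pass the reused prefix t has disappeared: both <tag> and the outer <t:tag> land on the same generated prefix, while the inner <t:tag> and <t:p> get a different one.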
> The namespace processing,
> if it has a chance, should be put into the parser itself and not in
> xml-to-object or in some higher level processing. Adding in a namespace
> aware SAX2-style handler into parse-xml is IMHO the only workable way to
> go.
Well, I don't agree, for the reasons given in
my other post and in this one.
[16/16] from: inetw3:mindspring at: 12-Mar-2004 10:09
Do you want the parsed XML displayed?
And if so, how do you want it to be
displayed?
I'm not familiar with seeing xml DOM's
showing parsed output xml unless it's
called with xml functions or through a
viewer/editor.
The functions chosen for %quickparse.r
are the ECMAScript binding functions, which
I find a lot easier to use with web pages
and with inline JavaScript function calls in my
View browser,
i.e...
<p id="p1" color="red">Change this text</p>
<input type="button"
onclick="getattribute(p1).setnodevalue({This text changed})" />
.....The changes are made in the HTML
page and VID code, which can be saved.
and in rebol...
<input type="button"
onclick="p1/text.{This text changed}.show.p1" />
.....But the HTML is not changed and if the
page is saved the original code remains.
There's no need to use parse-xml with
these functions because you can drill down
into any part of the xml/html file to make
changes and use them with Rebol code.
But in the spirit of Rebol,
"To each his/her own"
Notes
- Quoted lines have been omitted from some messages.