Mailing List Archive: 49091 messages

object2XML

 [1/16] from: tbrownell:veleng at: 5-Mar-2004 14:34


Does anyone have a simple example using Gavin's xml-object.r and xml-parse.r that shows:

1. Reading an XML document.
2. Converting it to an object for manipulation within REBOL.
3. Converting that object back into XML and rewriting it to the file.

For example, given

    <person><name>Bob</name><email>[bob--spam--com]</email></person>

how would one change the email and save it back?

Thanks,
Terry
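For readers who only want to see the three steps from the question end to end, here is a minimal sketch. It is written in Python with the standard xml.etree.ElementTree module rather than with xml-object.r and xml-parse.r, so it illustrates the read, modify, write-back cycle only and says nothing about how Gavin's scripts do it. The function name change_email and the example.com / example.org addresses are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the question; the addresses are placeholders.
SOURCE = "<person><name>Bob</name><email>bob@example.com</email></person>"


def change_email(xml_text, new_email):
    """Parse xml_text, replace the text of <email>, and return the new XML."""
    root = ET.fromstring(xml_text)           # 1. read the XML
    email = root.find("email")               # 2. manipulate the in-memory tree
    if email is None:
        raise ValueError("no <email> element found")
    email.text = new_email
    return ET.tostring(root, encoding="unicode")  # 3. convert back to XML


updated = change_email(SOURCE, "bob@example.org")
print(updated)
```

To work on a file instead of a string, ET.parse and ElementTree.write would take the place of fromstring and tostring.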

 [2/16] from: jason:cunliffe:verizon at: 5-Mar-2004 17:51


Hi Terry,

This Rebol Jabber client might be useful for you to study:
http://www.rebolfrance.net/projets/concours/maoww.zip

- Jason

 [3/16] from: robert:muench:robertmuench at: 6-Mar-2004 14:02


On Fri, 05 Mar 2004 14:34:03 -0800, Terry Brownell <[tbrownell--veleng--com]> wrote:
> Anyone have a simple example using Gavin's xml-object.r and xml-parse.r
> that shows...
<<quoted lines omitted: 4>>
> <person><name>Bob</name><email>[bob--spam--com]</email></person>
> How would one change the email and save it back?
Hi, what's the big problem with it? You parse the XML into a stream of tags and content. Then you build a block with the tag names as set-words and the content as values (a plain append should do the job), and then use this block as the prototype for a new object.

But you should consider whether to use an object at all. I would use a block of name/value pairs, which is much easier to handle. Converting such a block back to XML is easy.

Yes, I know I shouldn't only talk about how to do it but actually do it... but there are other things on my to-do list.

Robert
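The name/value-pair idea can be sketched roughly as follows. This is an illustration in Python, not REBOL, and not code from anyone on the list: a list of (name, value) tuples stands in for the REBOL block, and pairs_to_xml is a hypothetical helper name. It assumes element content only (no attributes, no mixed content).

```python
from xml.sax.saxutils import escape


def pairs_to_xml(tag, pairs):
    """Serialize a list of (name, value) pairs as child elements of <tag>.

    A value that is itself a list of pairs becomes a nested element.
    """
    parts = ["<%s>" % tag]
    for name, value in pairs:
        if isinstance(value, list):
            parts.append(pairs_to_xml(name, value))
        else:
            # escape() protects &, < and > in text content
            parts.append("<%s>%s</%s>" % (name, escape(str(value)), name))
    parts.append("</%s>" % tag)
    return "".join(parts)


person = [("name", "Bob"), ("email", "bob@example.com")]
person[1] = ("email", "bob@example.org")   # change the value in place
xml_text = pairs_to_xml("person", person)
print(xml_text)
```

Changing a value is an ordinary list operation, and serialization is a short recursive walk, which is the convenience the message argues for.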

 [4/16] from: tbrownell:veleng at: 6-Mar-2004 22:09


It may not be a big deal to a guru, but I've spent a fair bit of time going over everything I can find on Rebol and XML, and frankly, the tools just aren't there. Or if they are, they're not well documented, if they are documented at all.

I can easily convert nested blocks into XML via xmlgen.r, but I can't see any way, let alone an easy, documented way, to:

a) read an XML doc
b) manipulate it in Rebol (i.e. poke)
c) convert it back to XML

I've never really needed to do this with Rebol before, so I never gave it much thought. But it strikes me that, given the continuing interest in XML, Rebol will need to make a strong showing.

Robert M. Münch wrote:

 [5/16] from: gchiu:compkarori at: 7-Mar-2004 9:10


T. Brownell wrote.. apparently on 6-Mar-2004/22:09:36-8:00
> tools just aren't there. Or if they are, they're not well documented if
> they are documented at all. I can easily convert nested blocks into xml
Gavin's documentation (found by using RebolML's search function :) ):

http://web.archive.org/web/20020210063622/www3.sympatico.ca/gavin.mckenzie/rebol/xml-object-info.html

--
Graham Chiu
http://www.compkarori.com/cerebrus
http://www.compkarori.com/rebolml

 [6/16] from: inetw3:mindspring at: 8-Mar-2004 18:24


Hello T.B.,

Try my quickparser.r script at Rebol.org. Search for *quickparser*.

You're right about Rebol and XML. I believe sooner or later Carl will add an XML parser, either as a DLL or as functions in the REBOL exe. The closer they get to an IE plug-in, the more they will see that Rebol really needs to deal with XHTML, XML, HTML, etc., whether they like it or not, or people at large will not use it as a first-class quick-fix scripting language. That will kill Rebol as a solution: it will be fragmented(IE) and made into a nice but impractical toy.

 [7/16] from: atruter:labyrinth:au at: 9-Mar-2004 12:36


> whether they like it or not or people at large will
> not use it as a first class quick-fix scripting language.
This may be true for *some* people in *some* problem domains. Given the variety of jobs / tasks REBOL is put to, I always have to laugh at the various pronouncements that, "REBOL will die if it doesn't have [insert feature / buzzword here]". If it's not the right tool for the job, use a different tool (or redefine the job).
> Which will kill rebol as a solution, it will be fragmented(IE),
> and made into a nice but impractical toy.
Same comment as above. Regards, Ashley

 [8/16] from: inetw3:mindspring at: 8-Mar-2004 22:38


Hey Ashley,

It is funny that people get worked up over whether or not their wishlist stuff gets implemented in a new or developing language, and if it doesn't, they hope that language dies.

What did I mean by *fragmented(IE)...
>>If it's not the right tool for the job, use a different tool
My point exactly. We should want to be able to use Rebol. (I know Carl does.)

I know Rebol will never die; it will just transform. If it transforms because of an IE plug-in, and it has, we will work hard at creating a way to deal with Internet Explorer from a Rebol point of view. We or RT can create this non-GUI API, or everyone, in the spirit of Rebol, can roll their own.

#Roll your own = fragmented(IE) scripting support. My script cannot easily work with your script. (Hey, I'll just extend parts of your script and call it my script! Now I'll sell my script!)

Rebol/Plug-in.exe API = yet another Rebol exe to add to the Rebol/whatevers. And to have it work for Mac, Linux, Opera, Mozilla, etc., we must build more Rebol/whatevers, or M$ programming will dominate how we deal with in-the-Web programming.

This is not a negative post. It just shows that no matter how you spin it, whatever route you go with the browsers, XML plugs have to be built into the Rebol API or it won't be practical to use. Unless the plug-in is only there to launch Reblets from the browser, in which case RT has done their job very well.

 [9/16] from: bry:itnisk at: 9-Mar-2004 11:31


Well, it seems to me that the best solution to that is support for LibXML, LibXSLT (http://www.xmlsoft.org/), etc. If all of that were supported, Rebol's XML problems would be over.
> Hello T.B.,
>
> Try my quickparser.r script at Rebol.org
> Search for *quickparser*
>
> You're right about Rebol and XML. I believe sooner or later
> Carl will either add an xmlparser as a dll or functions to
> the REBOL exe. The closer they get to an IE plug-in, the
> more they will see Rebol really needs to deal with xhtml,
> xml, html, etc. whether they like it or not or people at large will
> not use it as a first class quick-fix scripting language.
> Which will kill rebol as a solution, it will be fragmented(IE),
> and made into a nice but impractical toy.
> ----- Original Message -----
<<quoted lines omitted: 4>>
> >
> > XML can be both simple and complex. This site is devoted to the creation
> > of the necessary tools and docs to make the complex simple too using
> > Rebol.
> >
> > http://compkarori.com/vanilla/display/Simple+XML

 [10/16] from: robert:muench:robertmuench at: 9-Mar-2004 18:46


On Mon, 8 Mar 2004 18:24:11 -0600, iNetW3 <[inetw3--mindspring--com]> wrote:
> You're right about Rebol and XML. I believe sooner or later
> Carl will either add an xmlparser as a dll or functions to
> the REBOL exe. The closer they get to an IE plug-in, the
> more they will see Rebol really needs to deal with xhtml,
> xml, html, etc. whether they like it or not or people at large will
> not use it as a first class quick-fix scripting language.
Can you give a use case where this is required? Every so often this XML topic comes up again. If the demand is that high, why hasn't anyone started to write one?

When I use Rebol, I only see a use for XML in import or export. This can be done quite well with the on-board tools. Why should I use XML stuff?

Think about why no full-blown XML parser exists for Rebol... this is a much more interesting question ;-)

Robert

 [11/16] from: maximo:meteorstudios at: 9-Mar-2004 13:40


Hi Robert,

IMHO, for one thing, it is like the "regular expression" (RE) topic, which crops up now and then.

People need an easy migration path. Anyone who has been convinced that XML is the be-all and end-all of ASCII data sharing will be more easily lured if it is more completely supported.

For myself, I have found XML is nice to support at least on import, because many open-source and online tools use XML as an export format. Things like namespaces, I have read, will confuse Rebol's XML engine... maybe we are simply lacking the more advanced features, which are getting more common than they used to be...

I really am not an XML genius. I just noticed that it is a trendy and competent ASCII format. Maybe, if the rest of the world is using it, we should at least support it conveniently, so that the phrase

"REBOL is the glue that binds things together"

would hold more meaning with regard to other tools which already expect their data to be bound to other tools...

Maybe the reason no one has done a complete XML port is that no guru has spent the better part of a month or two on it. My guess is that many would benefit, but not that many are able to actually do it... so they eventually turn to another solution.

In the near future, I will have to parse several MB of textual XML data, and I will see at that point how Rebol handles it. Until then, I'm just an interested reader...

cheers! :-)

-MAx
---
"You can either be part of the problem or part of the solution, but in the end, being part of the problem is much more fun."

 [12/16] from: bry:itnisk at: 9-Mar-2004 21:52


Well, I'm reasonably knowledgeable in matters of XML usage, etc., although my Rebol knowledge is shit, given that I just use it for small scripting hacks here and there. If I use Gavin's xml-object and a function to clean up the output a bit:

    doc-tree: func [unpickedDom] [pick third unpickedDom 1]

I get

    >> xmldom: parse-xml read %t.xml
    XML Version: 1.0
    == [document none [["tag" ["r" "here" "xmlns:stuff" "http://www.x.com"] ["^/stuff here ^/" ["stuff:p" ["hi" "test"] [["blah" none [...
    >> t: doc-tree xmldom
    == ["tag" ["r" "here" "xmlns:stuff" "http://www.x.com"] ["^/stuff here ^/" ["stuff:p" ["hi" "test"] [["blah" none ["more"]]]] ^/ ...

t is actually

    ["tag" ["r" "here" "xmlns:stuff" "http://www.x.com"] ["^/stuff here ^/" ["stuff:p" ["hi" "test"] [["blah" none ["more"]]]] "^/" ["blah" none none] "^/"]]

Now, in this case I don't think the namespaces are a problem. I don't understand xml-object well enough to know if it fails on namespace problems, but a namespace function could be built easily enough to go through the block, getting all referenced namespaces and checking against those references whenever a usage is encountered. Of course, it should be noted that the namespaces are placed in a block with the attributes, but I don't think that is a major problem, although there should of course be functions for returning just attributes without namespaces.

What I find more irritating is the textnodes. I have 4 textnodes:

    ^/stuff here ^/
    ["more"]
    ^/
    and again ^/

Now, none is used in an empty tag, but "^/" is used for any empty textnode, and "^/ string^/" seems to be used for any textnode that has a sibling node, whereas textnodes that are only children are represented as a block with one string value. It would probably be better to just do that as another ["^/string value^/"].

One of the things that should probably be considered for any functions for working with XML in Rebol is optimizations for working with various types of XML: for example, a document-like structure such as we see above (for which I would say the rule is that an element which has both a textnode and an element as direct children is a document structure), as opposed to the more programmer-friendly data-type structure:

    <customers>
      <customer>
        <name>
          <fname>John</fname>
          <lname>Simpson</lname>
        </name>
        .....
      </customer>
    </customers>
> Hi Robert,
>
> IMHO for one thing, it is like the "regular expression" (RE) topic which crops up now and then.
>
> People need an easy migration path. Anyone who has been convinced that xml is the end of the world in ascii data sharing, will be more easily lured if that is more completely supported.
>
> for myself, I have found xml is nice to support at least on import because many open source and on-line tools use XML as an export format. Things like namespaces, I have read, will obfuscate rebol's xml engine... maybe we are simply lacking in the more advanced features which are getting more common than they used to be...
>
> I really am not an xml genius, I just noticed that it is a trendy and competent ascii format, maybe if the rest of the world is using it... we should at least support it conveniently so that the phrase:
>
> "REBOL is the glue that binds things together"
>
> would hold more meaning with regards to other tools which already expect their data to be bound to other tools...
>
> maybe the fact that no one has done a complete xml port is that no guru has spent the better part of a month or two to do it.
>
> My guess is that many would benefit, but not that many are able to actually do it... so they eventually turn to another solution..
>
> in the near future, I will have to parse several MBs of textual xml data and I will see at that point how rebol handles it. until then, I'm just an interested reader...
>
> cheers! :-)
>
> -MAx
> ---
> "You can either be part of the problem or part of the solution, but in the end, being part of the problem is much more fun."

 [13/16] from: rebol:gavinmckenzie:fastmail:fm at: 9-Mar-2004 18:00


Comments below...

On Tue, 9 Mar 2004 21:52:59 CET, [bry--itnisk--com] said:
> Well I'm reasonably knowledgeable in matters
> of xml usage, etc. although my rebol
<<quoted lines omitted: 6>>
> I get
> >> xmldom: parse-xml read %t.xml
You meant parse-xml+ right? parse-xml is the REBOL built-in parser.
> [snip]
> now in this case I don't think the
<<quoted lines omitted: 5>>
> namespaces and checking against those
> references whenever a usage is encountered.
What do you wish to do with namespaces? They aren't at all as straightforward as they seem. They get inherited, and the namespace prefixes can be reused within the nesting of the document all the while resolving to totally different namespace URIs. The namespace processing, if it has a chance, should be put into the parser itself and not in xml-to-object or in some higher level processing. Adding in a namespace aware SAX2-style handler into parse-xml is IMHO the only workable way to go.
> Of course it should be noted that the
> namespaces are placed in a block with the
> attributes but I don't think that is a major
> problem although there should of course be
> functions for returning just attributes
> without namespaces.
>
But that's the thing: namespace declarations *look* like attributes, but they really aren't as far as XML is concerned. They need to be treated specially.
> What I find more irritating is the textnodes:
> I have 4 textnodes:
<<quoted lines omitted: 11>>
> probably be better to just do that as
> another ["^/string value^/"]
It depends what you want. Dropping any whitespace is a decision that can only be made by the processing application and not the parser. The parse-xml+ code has a set of default handlers, but you could choose to implement your own. xml-to-object is intended to work with "data" styles of XML and hence whitespace is more easily discarded in such XML without too much risk.
> One of the things that should probably be
> considered for any functions for working
<<quoted lines omitted: 15>>
> </customer>
> </customers>
Agreed, the decisions about "optimizing" have to be done in light of the type of XML you're processing and at a processing level above the parser, not down in the parser itself.

 [14/16] from: bry:itnisk at: 10-Mar-2004 11:06


> What do you wish to do with namespaces? They aren't at all as
> straightforward as they seem. They get inherited, and the namespace
> prefixes can be reused within the nesting of the document all the while
> resolving to totally different namespace URIs. The namespace processing,
> if it has a chance, should be put into the parser itself and not in
> xml-to-object or in some higher level processing. Adding in a namespace
> aware SAX2-style handler into parse-xml is IMHO the only workable way to
> go.
I'm talking about having a library of functions that call parse-xml and then do the namespace conformance checking. Why would this be a good idea?

1. XML version 1.0 does not have any connection to the namespace specification. There is the following note in the current version of the spec:

    The Namespaces in XML Recommendation [XML Names] assigns a meaning to names containing colon characters. Therefore, authors should not use the colon in XML names except for namespace purposes, but XML processors must accept the colon as a name character.

Most processors do not accept the colon as a name character without a namespace declaration, but as can be seen from the text above, that is incorrect. Therefore one can in fact have XML documents that have elements called blah:text and have those documents be well-formed, although of course that is not industry-standard practice. (If you examine the SVG put out by Illustrator, Photoshop, etc., you will notice that when an xlink: namespace prefix is used there is no xlink namespace declaration in the document. This of course violates the XLink spec but not the XML spec.)

Because of this, it might be preferable to layer the namespace handling in such a way that one can build stricter levels of specification conformance.
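The layering argument can be seen with any parser that lets namespace processing be switched on and off. The following is a small sketch in Python, not REBOL, using the standard expat binding: with namespace processing off, an element named blah:text with no declaration is accepted as plain XML 1.0; ElementTree, which always turns namespace processing on, rejects the same input. The function names are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

# A colon in the element name, but no xmlns:blah declaration anywhere.
DOC = "<blah:text>hello</blah:text>"


def element_names_plain(xml_text):
    """Parse as plain XML 1.0 (no namespace processing); return element names."""
    names = []
    parser = expat.ParserCreate()  # no namespace_separator: namespaces are ignored
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


def parses_with_namespaces(xml_text):
    """Return True if a namespace-aware parse (ElementTree) accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(element_names_plain(DOC))
print(parses_with_namespaces(DOC))
```

A stricter layer can then be added on top of the plain parse, rather than being forced on every document.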
> > Of course it should be noted that the
> > namespaces are placed in a block with the
> > attributes but I don't think that is a major
> > problem although there should of course be
> > functions for returning just attributes
> > without namespaces.
> >
>
> But that's the thing: namespace declarations *look* like attributes, but
> they really aren't as far as XML is concerned. They need to be treated
> specially.
>
Hence my making a differentiation between them in my post. Again, to a straight conformant XML 1.0 processor, the fact that an attribute is called xmlns:hi means absolutely nothing. To a processor that understands both namespaces and XML 1.0, it does mean something. Therefore, again, I suppose it may be useful to keep namespace handling as functions separate from parse-xml.
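A function that keeps namespace declarations apart from ordinary attributes, outside the parser, could look roughly like this. It is a sketch in Python rather than REBOL; it assumes the flat name, value, name, value layout that the attribute block in the earlier parse-xml output appears to use, and split_attributes is a hypothetical name.

```python
def split_attributes(flat):
    """Split a flat [name, value, name, value, ...] list into two dicts.

    Returns (attributes, namespaces). Namespace declarations are the entries
    named "xmlns" (default namespace, stored under "") or "xmlns:prefix".
    """
    attributes, namespaces = {}, {}
    if flat is None:          # an element without attributes
        return attributes, namespaces
    for name, value in zip(flat[0::2], flat[1::2]):
        if name == "xmlns":
            namespaces[""] = value
        elif name.startswith("xmlns:"):
            namespaces[name[len("xmlns:"):]] = value
        else:
            attributes[name] = value
    return attributes, namespaces


attrs, ns = split_attributes(["r", "here", "xmlns:stuff", "http://www.x.com"])
print(attrs)
print(ns)
```

This only separates the two kinds of entry for one element; resolving prefixes through nested scopes would be a further layer.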
> > What I find more irritating is the textnodes:
> >
> > I have 4 textnodes:
<<quoted lines omitted: 6>>
> >
> > now none is used in an empty tag, but "^/"
> > is used for any empty textnode, and "^/
> > string^/" seems to be used for any textnode
> > that has a sibling node, whereas textnodes
> > that are only children are represented as a
> > block with one string value. it would
> > probably be better to just do that as
> > another ["^/string value^/"]
> >
>
> It depends what you want. Dropping any whitespace is a decision that can
> only be made by the processing application and not the parser. The
> parse-xml+ code has a set of default handlers, but you could choose to
> implement your own. xml-to-object is intended to work with "data" styles
> of XML and hence whitespace is more easily discarded in such XML without
> too much risk.
Again, that was not what I was complaining about. I found the difference between how textnodes are represented disconcerting for the usage of a stricter parser built on top of parse-xml. It seems to me that "^/string value here" is a reasonable way to signify that a node is a textnode, since an element name can't start with a ^, and one would just not check whether a node is a textnode or an element inside an attribute block.

NOTE: again, this is discussing the possibility of a generic XML processing library of functions on top of parse-xml, so that you could have a strip-empty-text func that takes a rebolxmldom parameter and returns the rebolxmldom with all empty textnodes stripped out.
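A strip-empty-text function of the kind described might look like the sketch below. It is written in Python, not REBOL, and is not an existing library function. It assumes each element is a three-item list [name, attributes, children], with None standing in for REBOL's none and "\n" for "^/", modeled on the block posted earlier in the thread.

```python
def strip_empty_text(node):
    """Return a copy of node with whitespace-only text children removed.

    node is [name, attributes, children]; children is None or a list that
    mixes strings (text nodes) with nested [name, attributes, children] lists.
    """
    name, attrs, children = node
    if children is None:              # empty tag: nothing to strip
        return [name, attrs, None]
    kept = []
    for child in children:
        if isinstance(child, str):
            if child.strip():         # keep text that has non-whitespace content
                kept.append(child)
        else:
            kept.append(strip_empty_text(child))
    return [name, attrs, kept]


tree = ["tag", ["r", "here", "xmlns:stuff", "http://www.x.com"], [
    "\nstuff here \n",
    ["stuff:p", ["hi", "test"], [["blah", None, ["more"]]]],
    "\n",
    ["blah", None, None],
    "\n",
]]
stripped = strip_empty_text(tree)
print(stripped)
```

The function returns a new tree and leaves its argument untouched, so the caller, not the parser, decides whether whitespace is significant.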
> >
> > One of the things that should probably be
> > considered for any functions for working
> > with xml in rebol is optimizations for
> > working with various types of xml, for
> > example a document like structure such as we
> > see above (for which I would say the rule is
> > that a document structure has multiple
> > textnodes, that an element which has as a
> > direct child a textnode and an element is a
> > document structure) as opposed to the more
> > programmer friendly data type structure:
> >
>
> Agreed, the decisions about "optimizing" have to be done in light of the
> type of XML you're processing and at a processing level above the parser,
> not down in the parser itself.
I'm suggesting that rather than having parse-xml as the first and final way to read a document, one should have a library built around parse-xml. So I'm not saying that parse-xml should be fixed; I've come to the conclusion that it is reasonably okay as a starting point.

Why is it reasonably okay? Because frankly there is a lot of non-conformant XML out there that is, in usage, accepted by different applications and processors. I would as a general rule be against working with such stuff but, for example, MSXML accepts elements named xml. According to the recommendation, that name is reserved:

    [Definition: A Name is a token beginning with a letter or one of a few punctuation characters, and continuing with letters, digits, hyphens, underscores, colons, or full stops, together known as name characters.] Names beginning with the string "xml", or with any string which would match (('X'|'x') ('M'|'m') ('L'|'l')), are reserved for standardization in this or future versions of this specification.

That of course wouldn't be so bad, but a lot of Microsoft markup comes with elements named xml in it, and a lot of people using only MSXML have XML documents with element names like xml-metadata and suchlike. It would probably be a good thing if one could accept those documents.

 [15/16] from: bry:itnisk at: 10-Mar-2004 11:19


> What do you wish to do with namespaces? They aren't at all as
> straightforward as they seem. They get inherited, and the namespace
> prefixes can be reused within the nesting of the document all the while
> resolving to totally different namespace
> URIs.
I believe this is what Joe English called psychotic namespacing. I've seen a lot of fucked-up XML in my time, and it's actually quite rare to see:

    <tag xmlns="http://tag.com" xmlns:t="http://tag.com">
      <t:tag>
        <p>hi</p>
        <t:tag xmlns:t="http://nottag.com">
          <t:p>hi</t:p>
        </t:tag>
      </t:tag>
    </tag>

But it is of course possible. My perspective is to penalize that kind of structure and to optimize for more common structures, and hell, if it turns out in the middle of analyzing a structure that it is this kind of mess, to restart. So it takes longer; too bad.

Also, from a namespace point of view the prefix is absolutely meaningless, so one could in fact process the above with a namespace function that, if it encounters a prefix the same as one it has encountered before but bound to a different namespace, simply autogenerates a prefix, changes the value in the block to that prefix, and moves on.
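To make the point about prefixes concrete, here is what a namespace-aware parser reports for that nested example. The sketch uses Python's standard ElementTree, not REBOL; ElementTree discards prefixes and writes each element name as {uri}local, and expanded_names is a made-up helper that splits that form back apart.

```python
import xml.etree.ElementTree as ET

# The nested example from the message: the prefix t is rebound halfway down.
DOC = """<tag xmlns="http://tag.com" xmlns:t="http://tag.com">
  <t:tag>
    <p>hi</p>
    <t:tag xmlns:t="http://nottag.com">
      <t:p>hi</t:p>
    </t:tag>
  </t:tag>
</tag>"""


def expanded_names(xml_text):
    """Return each element's (namespace URI, local name) in document order."""
    names = []
    for element in ET.fromstring(xml_text).iter():
        if element.tag.startswith("{"):
            uri, local = element.tag[1:].split("}", 1)
        else:
            uri, local = "", element.tag   # element in no namespace
        names.append((uri, local))
    return names


for uri, local in expanded_names(DOC):
    print(uri, local)
```

The two t:tag elements come out with different URIs, while the unprefixed tag and the outer t:tag come out identical, which is why the prefix alone cannot be used to compare names.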
> The namespace processing,
> if it has a chance, should be put into the parser itself and not in
> xml-to-object or in some higher level processing. Adding in a namespace
> aware SAX2-style handler into parse-xml is IMHO the only workable way to
> go.
Well, I don't agree, for the reasons given in my other post and in this one.

 [16/16] from: inetw3:mindspring at: 12-Mar-2004 10:09


Do you want the parsed XML displayed? And if so, how do you want it to be displayed? I'm not familiar with XML DOMs showing parsed output XML unless it's called with XML functions or through a viewer/editor.

The functions chosen for %quickparse.r are the ECMAScript binding functions, which I find a lot easier to use with webpages and with inline JavaScript function calls in my View browser, i.e.

    <p id="p1" color="red">Change this text</p>
    <input type="button" onclick="getattribute(p1).setnodevalue({This text changed})" />

.....The changes are made in the HTML page and VID code, which can be saved.

And in Rebol...

    <input type="button" onclick="p1/text.{This text changed}.show.p1" />

.....But the HTML is not changed, and if the page is saved the original code remains.

There's no need to use parse-xml with these functions, because you can drill down into any part of the XML/HTML file to make changes and use them with Rebol code.

But in the spirit of Rebol, "To each his/her own"

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted