Mailing List Archive: 49091 messages

Rebol & XML

 [1/31] from: AJMartin::orcon::net::nz at: 5-Aug-2003 22:04


Thanks, Bryan and Will! Bryan wrote:
> I've thought about doing the same, mainly cause I want to have xpath in
> Rebol, and to do that I need a decent xml parser. I'm sure you're better
> qualified than me for doing it but if you need any help on the project
> I'd be glad to help.

I've discovered that Gavin's parse-xml is based on SAX, the event model of processing XML. At the moment, my thoughts are going towards a DOM model, because Rebol is oriented that way, I feel, in reading and writing all of a file at once. The DOM model builds a tree in memory. I want to access the various values with path! values in Rebol.

Here's a little XML (XMLSS from MS Excel 2002):

    XML: {<?xml version="1.0"?>
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:o="urn:schemas-microsoft-com:office:office"
     xmlns:x="urn:schemas-microsoft-com:office:excel"
     xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:html="http://www.w3.org/TR/REC-html40">
     <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
      <Author>Andrew John Martin</Author>
      <LastAuthor>Andrew John Martin</LastAuthor>
      <Created>2003-08-05T02:10:56Z</Created>
      <LastSaved>2003-08-05T02:10:57Z</LastSaved>
      <Company>Colenso High School</Company>
      <Version>10.4219</Version>
     </DocumentProperties>
     <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
      <DownloadComponents/>
      <LocationOfComponents HRef="file:///\\"/>
     </OfficeDocumentSettings>
    </Workbook>
    }

I'd like to process the above and then access the author's name with Rebol script like:

    XML/Workbook/DocumentProperties/Author

And set it with Rebol script like:

    XML/Workbook/DocumentProperties/Author: "Andrew Martin"

Also we should think about several tags at the same level of nesting, like in table:

    row
        cell
        cell
        cell

Unfortunately, there's a problem with accessing the attributes of a tag! For example, what's the path! value for accessing the value of the "xmlns" attribute in the "DocumentProperties" tag?

    XML/Workbook/DocumentProperties/________

Or perhaps I could use:

    XML/Workbook/DocumentProperties/_Attribute/xmlns

where "_Attribute" is the magic word for accessing attributes of a tag?

What do people think? Is there a better or simpler way that I've overlooked?

Andrew J Martin
ICQ: 26227169
http://www.rebol.it/Valley/
http://valley.orcon.net.nz/
http://Valley.150m.com/
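
For readers comparing the two processing models mentioned above, here is a minimal sketch using Python's standard library rather than Rebol. It is only an illustration and not part of the thread's code: the cut-down, namespace-free sample and the names NameCollector, sax_events and tree_author are assumptions made for the example.

```python
import xml.sax
import xml.etree.ElementTree as ET

# Cut-down, namespace-free version of the workbook sample (an assumption
# made to keep the sketch short).
SAMPLE = (
    "<Workbook><DocumentProperties>"
    "<Author>Andrew John Martin</Author>"
    "</DocumentProperties></Workbook>"
)


class NameCollector(xml.sax.ContentHandler):
    """Event model: the parser calls back as each tag opens and closes."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


def sax_events(text):
    """Return the list of (kind, tag-name) events seen while parsing."""
    handler = NameCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events


def tree_author(text):
    """Tree model: load the whole document, then walk to a value by path."""
    root = ET.fromstring(text)  # root is the <Workbook> element
    return root.findtext("DocumentProperties/Author")
```

With the event model the caller has to keep its own state between callbacks; with the tree model the whole document is in memory and a path-like lookup becomes possible, which is the property the path! idea relies on.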

 [2/31] from: bry:itnisk at: 5-Aug-2003 15:06


> At the moment, my thoughts are going towards a DOM model, because Rebol
> is oriented that way, I feel, in reading and writing all of a file at once.

Definitely should be DOM; DOM is more familiar to most developers and more popular than SAX.

> The DOM model builds a tree in memory. I want to access the various values
> with path! values in Rebol. Here's a little XML (XMLSS from MS Excel 2002):
<<quoted XML sample omitted>>
> I'd like to process the above and then access the author's name with Rebol
> script like:
>     XML/Workbook/DocumentProperties/Author

which is basically an XPath. I think it should probably be something like:

    xpath XML "/Workbook/DocumentProperties/Author"

> And set it with Rebol script like:
>     XML/Workbook/DocumentProperties/Author: "Andrew Martin"

Yeah, that was something I was also considering: the possibility of an XPath setting syntax in Rebol.

> Also we should think about several tags at the same level of nesting, like
> in table:
>     row
>         cell
>         cell
>         cell

In the XPath data model of XML this would be taken care of via position() (http://www.w3.org/TR/xpath#section-Node-Set-Functions), so that one has

    row/cell[last()]

returning the last cell node under row, and

    row/cell[position() = 2]

or

    row/cell[2]

returning the second.

My idea was to have an object hierarchy that could be navigated in the normal Rebol manner, then have an XPath parser that would parse XPath strings to work out the Rebol path to something. This might have problems though.
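
The positional predicates can be tried out with the limited XPath subset in Python's xml.etree.ElementTree. This is a sketch for illustration only; the table sample is made up to match the row/cell outline. As far as I know, ElementTree accepts the short forms cell[2] and cell[last()] but not the long form cell[position() = 2].

```python
import xml.etree.ElementTree as ET

# Hypothetical sample matching the row/cell/cell/cell outline.
TABLE = "<table><row><cell>a</cell><cell>b</cell><cell>c</cell></row></table>"

root = ET.fromstring(TABLE)  # root is the <table> element

# XPath positions are 1-based: cell[2] is the second <cell> under <row>.
second_cell = root.find("row/cell[2]").text

# last() selects the final <cell> regardless of how many there are.
last_cell = root.find("row/cell[last()]").text

# Without a predicate, every matching sibling is returned in document order.
all_cells = [cell.text for cell in root.findall("row/cell")]
```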
> Unfortunately, there's a problem with accessing the attributes of a tag! For
> example, what's the path! value for accessing the value of the "xmlns"
> attribute in the "DocumentProperties" tag?
<<quoted lines omitted: 4>>
> What do people think? Is there a better or more simpler way that I've
> overlooked?

xmlns is a namespace declaration and as such not an actual attribute. It depends on which specifications your parser supports: a completely valid XML parser supporting just the original XML specification would treat it as an attribute, but most parsers do not, because they also support namespaces.

Well, I think it needs to be abstracted one level, so the information we get out is something like this (this is probably horribly wrong, since I haven't had much occasion to use make object!, and what I did have was a while ago):

    xml: make object! [
        element: make object! [
            name: "Workbook"
            attributes: []
            default-namespace: "urn:schemas-microsoft-com:office:spreadsheet"
            namespaces: [
                o: "urn:schemas-microsoft-com:office:office"
                x: "urn:schemas-microsoft-com:office:excel"
                ss: "urn:schemas-microsoft-com:office:spreadsheet"
                html: "http://www.w3.org/TR/REC-html40"
            ]
            childtree: make object! [
                element: make object! [
                    name: "DocumentProperties"
                    .................... and so forth ....................
                ]
            ]
        ]
    ]

Consider if this has to handle XML like the following:

    <doc>
    <section>hi <p att="here">text</p> some more text</section>
    </doc>

There has to be a way to get hold of the various text nodes. There are three text nodes under section. So we would need something like this:

    xml: make object! [
        element: make object! [
            name: "doc"
            childtree: make object! [
                element: make object! [
                    name: "section"
                    childtree: make object! [
                        t1: "hi"
                        element: make object! [
                            name: "p"
                            attributes: [att: "here"]
                            t1: "text"
                        ]
                        t2: "some more text"
                    ]
                ]
            ]
        ]
    ]

Okay, enough of that, you get the point. It could probably be better designed, but there are problems here: if the name of an element has a namespace prefix,

    element: "svg:svg"

then of course the svg prefix needs to be associated somewhere with the svg namespace. The same applies if an attribute is associated with a namespace prefix (this is very rare).

Namespaces can be tricky. People have a lot of preconceptions about them that do not always bear out, and different XML dialects have subtly different namespace processing models. A case in point is the SVG processing model, which insists that if an SVG-namespaced element is within an element in a namespace the processor is unfamiliar with, then the SVG-namespaced element is removed from the parse tree. Most XML dialects of course have a model of ignoring the unknown namespace and forging ahead.

It might be possible to have a top-level object that holds all document namespaces, and use this as a way to optimize namespace checking. Most of the time namespaces are declared on the document element; if a namespace isn't found there, one can then try checking for it in the local tree, but if it is there then one does not have to check in the local tree.

The structure above of course means that you can't have, as you wanted before,

    XML/Workbook/DocumentProperties

But one could build an XPath interpreter on top of it, or a lightweight one really quickly, that allowed you to write that and then went through the steps. It would then also allow us to have functions like

    documentElement myxml

which would return "Workbook", i.e. it would be possible to actually have something similar to a DOM implementation for Rebol.
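
To see how a namespace-aware parser treats the xmlns declarations discussed above, here is a small sketch with Python's xml.etree.ElementTree (an illustration only, using a shortened copy of the workbook sample; the prefix map named prefixes is an assumption of the example). The declarations do not show up as attributes, and each element name carries its namespace URI instead of a prefix.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the workbook sample from the first message.
WORKBOOK = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Andrew John Martin</Author>
 </DocumentProperties>
</Workbook>"""

root = ET.fromstring(WORKBOOK)

# The element name is reported as "{namespace-uri}local-name".
root_tag = root.tag

# xmlns and xmlns:o are namespace declarations, so the attribute map is empty.
root_attributes = dict(root.attrib)

# Lookups need a prefix-to-URI map; the prefix used here is local to the query
# and does not have to match the prefixes used in the document.
prefixes = {"o": "urn:schemas-microsoft-com:office:office"}
author = root.findtext("o:DocumentProperties/o:Author", namespaces=prefixes)
```

This matches the suggestion of keeping namespace information apart from the ordinary attributes of an element.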

 [3/31] from: brett:codeconscious at: 5-Aug-2003 23:50


Hi Andrew,
> I've discovered that Gavin's parse-xml is based on SAX or the event model of
> processing XML. At the moment, my thoughts are going towards a DOM model,
> because Rebol is oriented that way, I feel, in reading and writing all of a
> file at once. The DOM model builds a tree in memory. I want to access the
> ...

It has been a while since I looked at it, but I thought that the xml-object that Gavin wrote did build a tree of objects. However, I seem to recall that the parser was not entirely complete.

> I'd like to processs the above and then access the author's name with Rebol
> script like:
>
>     XML/Workbook/DocumentProperties/Author
>
> And set it with Rebol script like:
>
>     XML/Workbook/DocumentProperties/Author: "Andrew Martin"

It certainly looks neat. Even if your underlying XML structure doesn't support it, maybe a dialect can do the translation of your request. A question to ponder though: how many instances of these would be literally in your code, or would you need to build them up in other code?

> Also we should think about several tags at the same level of nesting, like
> in table:
>
>     row
>         cell
>         cell
>         cell

My XML is very limited. If we write a program to access a cell from the row in this example, are we implicitly encoding the structure of the XML (the DTD) into a program?
> Unfortunately, there's a problem with accessing the attributes of a tag! For
> example, what's the path! value for accessing the value of the "xmlns"
> attribute in the "DocumentProperties" tag?
>
>     XML/Workbook/DocumentProperties/________
>
> Or perhaps I could use:
>
>     XML/Workbook/DocumentProperties/_Attribute/xmlns
>
> Where "_Attribute" is the magic word for accessing attributes of a tag?

Perhaps the attributes of each element can be held separately from the element structures, and the same for namespaces. The path notation then becomes a key to each of these three storage structures. Therefore different aspects of a node could be:

    Content XML/Workbook/DocumentProperties
    Attributes XML/Workbook/DocumentProperties
    Namespace XML/Workbook/DocumentProperties
    etc.

This is not well thought through, just throwing some ideas out. By the way, do namespaces represent contracts for behaviour, or are they like Rebol objects (a way to provide separate contexts), or are they both?

> What do people think? Is there a better or more simpler way that I've
> overlooked?

I wish :^) I think that if you consider the different ways you might want to process the XML, you will find different representations that can be useful. Also, if a DTD (are these obsolete yet?) is available, the extra information might have an impact. E.g. using a DTD, maybe a program could be generated that knows how to traverse the XML implicitly (because it was generated to do so).

Not being much help am I? :^)

Regards,
Brett.

 [4/31] from: nitsch-lists:netcologne at: 5-Aug-2003 16:59


Hi Andrew and all,

how about

    XML/Workbook@/Document@/Author@
    change find XML/Workbook@ Author@ "Andrew Martin"

The idea is to name tags as email, attributes as issue. Then one could write XML/Workbook@/#myAttribute too.

    [Workbook@ [
        #xmlns ".."
        DocumentProperties@ [
            #xmlns ".."
            Author@ "Andrew John Martin"
        ]
        OfficeDocumentSetting@ [#xmlns
            LocationOfComponents@ [#HRef="file:///\\"]
        ]
    ]]

The following snippet changes parse-xml to return attributes as issues. Now I can write XML/3/1/2/#from. But tags should not be indexed; they should be named too, IMHO.

    ; use issues for attribute-names, better path-syntax
    xml-language: make xml-language [
        ;verbose: true
        add-attr: func [name value] [
            if none? second parent [parent/2: make block! 2]
            insert insert tail second parent to-issue name value
        ]
    ]

-Volker

On Tuesday, 5 August 2003 12:04, A J Martin wrote:

 [5/31] from: bry:itnisk at: 5-Aug-2003 22:25


A propos the subject: I realized after posting earlier today that there is of course the same problem with elements that I saw with text nodes, i.e. multiple elements as children of a node. It seems to me this will still be a problem if you just use element names, as in

    body: make object! [
        hi I'm some text that's a child of the body node
        p: "hi this is a paragraph"
        p: "this is another paragraph"
    ]

Obvious problem(s) there.

What I was doing before for text nodes was suggesting a naming standard of t{number}, so

    t1: "text"
    t2: "more text"

Then whenever one navigated to a spot in a path, one could find out from there how many text nodes there were. The same solution could be used for elements.

Also, as I indicated before, attributes can have namespace prefixes, so on second thought the attributes syntax could be changed to the following:

    attributes: [
        att1: [name: "m:att" value: "here is some text" namespace: "http://www.someuri.com/m"]
    ]

Other problems not touched on yet: processing instructions and comments.

The question is really whether one wants to build something good that handles one's own problems, or something that can be built on later to handle other problems. I don't think it would be too much of a problem to build an xml-to-object that just got the names of elements and attributes and such and worked in most cases. I don't think, however, that such a tool would cover all cases (I may be wrong about that; I'm not good enough in Rebol to judge my own judgements there, but I am tolerably well informed about XML issues). I think a totally generic tool will not allow a straight Rebol path of xml/workbook/documentProperties; there'll have to be a translation step.
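
The repeated-name and interleaved-text problem can be seen in how a tree API has to expose children as an ordered sequence rather than as uniquely named fields. The following is a rough sketch in Python's xml.etree.ElementTree, given only as an illustration; the helper name ordered_children and the sample strings are assumptions of the example.

```python
import xml.etree.ElementTree as ET

SECTION = '<doc><section>hi <p att="here">text</p> some more text</section></doc>'
BODY = "<body>some text<p>hi this is a paragraph</p><p>this is another paragraph</p></body>"


def ordered_children(elem):
    """Return text runs and child elements of elem in document order."""
    items = []
    if elem.text and elem.text.strip():
        items.append(elem.text.strip())  # text before the first child
    for child in elem:
        items.append(child)
        if child.tail and child.tail.strip():
            items.append(child.tail.strip())  # text following this child
    return items


section = ET.fromstring(SECTION).find("section")
section_items = ordered_children(section)

# Two children with the same name are simply two entries in a list.
paragraphs = [p.text for p in ET.fromstring(BODY).findall("p")]
```

Here section_items holds "hi", the p element, and "some more text" in that order, so no t1/t2 naming scheme is needed; the price is that children are addressed by position rather than by a unique name.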

 [6/31] from: AJMartin:orcon at: 6-Aug-2003 18:00


Here's my load-XML function. Note that it doesn't get the values the right way around. I've got something wrong in the commented-out section of the code. Can anyone figure out what I'm doing wrong?

    Load-XML: function [
        [catch]
        "Loads XML as a Rebol compatible block of values."
        XML [string! file!] "The XML string or file."
    ] [
        Content Stack Attribute Value Attribute_Value^ Text^ Declaration^
        Name Text Element^
    ] [
        Content: make block! 10
        Stack: make block! 10
        Attribute_Value^: [
            WS* copy Attribute [some Alpha opt [#":" some Alpha]]
            {="} copy Value to #"^"" skip (
                insert tail Content reduce [
                    to issue! Attribute Value
                ]
            )
        ]
        Text^: complement charset #"<"
        Declaration^: [
            "<?xml" (Name: 'xml) any Attribute_Value^ WS? "?>" (
                Content: reduce [Name Content]
            )
        ]
        Element^: [
            "<!--" thru "-->"
            | #"<" (
                ;push/only Stack reduce [Name Content]
                ;Content: make block! 6
            ) copy Name some Alpha any [Attribute_Value^] WS? [
                "/>"
                | #">" (push Stack Name) [
                    some [
                        copy Text some Text^ (
                            if not empty? trim Text [
                                insert tail Content reduce [to word! Name Text]
                            ]
                        )
                        | Element^
                    ]
                ]
                "</" (Name: pop Stack) Name WS? #">"
            ] (
                ;insert tail last first Stack reduce [Name Content]
                ;set [Name Content] pop Stack
            )
        ]
        if file? XML [
            XML: read XML
        ]
        all [
            probe parse/all/case XML [
                WS? Declaration^ WS? Element^ WS? end
            ]
            Content
        ]
    ]

    Push: func [
        "Inserts a value into a series and returns the series head."
        Stack [series! port! bitset!] "Series at point to insert."
        Value [any-type!]
        /Only "The value to insert."
    ][
        head either Only [
            insert/only Stack :Value
        ][
            insert Stack :Value
        ]
    ]

    Pop: function [
        "Returns the first value in a series and removes it from the series."
        Stack [series! port! bitset!] "Series at point to pop from."
    ][
        Value
    ][
        Value: pick Stack 1
        remove Stack
        :Value
    ]

    Fail^: [to end skip]    ; A rule that always fails.
    Succeed^: []            ; A rule that always succeeds.

    Octet: charset [#"^(00)" - #"^(FF)"]
    Digit: charset "0123456789"
    Digits: [some Digit]
    Upper: charset [#"A" - #"Z"]
    Lower: charset [#"a" - #"z"]
    Alpha: union Upper Lower
    Alphas: [some Alpha]
    AlphaDigit: union Alpha Digit
    AlphaDigits: [some AlphaDigit]
    Control: charset [#"^(00)" - #"^(1F)" #"^(7F)"]
    Hex: union Digit charset [#"A" - #"F" #"a" - #"f"]
    HT: #"^-"
    SP: #" "
    LWS: charset reduce [SP HT #"^(A0)"]
    LWS*: [some LWS]
    LWS?: [any LWS]
    LF: #"^(0A)"
    WS: charset reduce [SP HT newline CR LF]
    WS*: [some WS]
    WS?: [any WS]
    Graphic: charset [
        #"^(21)" - #"^(7E)"
        #"^(80)"
        #"^(82)" - #"^(8C)"
        #"^(8E)"
        #"^(91)" - #"^(9C)"
        #"^(9E)" - #"^(9F)"
        #"^(A1)" - #"^(FF)"
    ]
    Printable: union Graphic charset reduce [SP #"^(A0)"]
    Integer^: Digits
    Decimal^: [Digits #"." Digits]
    Money^: [#"$" Digits #"." 2 Digit]
    ; A Windows file name cannot contain any of these characters:
    Forbidden: charset {\/:*?"<>|}
    Line_End: [newline | end]
    Blank_Line: [LWS? newline]
    Blank_Lines: [any Blank_Line]

    make object! [
        Zone: [[#"+" | #"-"] 1 2 Digit #":" 2 Digit]
        set 'Time^ [1 2 Digit #":" 1 2 Digit opt [#":" 1 2 Digit]]
        Long-Months: remove map Rebol/locale/Months func [Month [string!]] [
            reduce ['| copy Month]
        ]
        Short-Months: remove map Rebol/locale/Months func [Month [string!]] [
            reduce ['| copy/part Month 3]
        ]
        Month: [1 2 Digit | Long-Months | Short-Months]
        Separator: charset "/-"
        Day: [1 2 Digit]
        set 'Date^ [
            [
                [Day Separator Month Separator [4 Digit | 2 Digit]]
                | [4 Digit Separator Month Separator Day]
            ]
            opt [#"/" [Time^ opt Zone]]
        ]
    ]

    make object! [
        Permitted: exclude Printable Forbidden
        Filename: [some Permitted]
        Folder: [Filename #"/"]
        Relative_Path: [some Folder]
        Absolute_Path: [#"/" any Relative_Path]
        set 'File^ [
            [Absolute_Path opt Filename]
            | [Relative_Path opt Filename]
            | Filename
        ]
    ]

    make object! [
        Permitted: exclude Printable Forbidden
        Drive^: [Alpha #":"]
        Filename^: [some Permitted]
        Folder^: [Filename^ #"\"]
        Relative_Path^: [some Folder^]
        Absolute_Path^: [#"\" any Relative_Path^]
        set 'Local_File^ [Drive^ Absolute_Path^ opt Filename^]
    ]

Andrew J Martin

 [7/31] from: AJMartin:orcon at: 6-Aug-2003 21:31


I figured out where I went wrong (I was 'push-ing too early). This function loads XML very nicely.

    Load-XML: function [
        [catch]
        "Loads XML as a Rebol compatible block of values."
        XML [string! file!] "The XML string or file."
    ] [
        Content Stack Attribute Value Attribute_Value^ Text^ Declaration^
        Name Text Element^
    ] [
        Content: make block! 10
        Stack: make block! 10
        Attribute_Value^: [
            WS* copy Attribute [
                some Alpha opt [#":" some Alpha]
            ]
            {="} copy Value to #"^"" skip (
                insert tail Content reduce [
                    to issue! Attribute Value
                ]
            )
        ]
        Text^: complement charset #"<"
        Declaration^: [
            "<?xml" (Name: 'xml) any Attribute_Value^ WS? "?>" (
                Content: reduce [Name Content]
            )
        ]
        Element^: [
            "<!--" thru "-->"
            | #"<" Z: copy Name [
                some Alpha (
                    push/only Stack reduce [to word! Name Content]
                    Content: make block! 6
                )
            ] any [Attribute_Value^] WS? [
                "/>"
                | #">" (push Stack Name) [
                    some [
                        copy Text some Text^ (
                            if not empty? trim Text [
                                insert tail Content Text
                            ]
                        )
                        | Element^
                    ]
                ]
                "</" (Name: pop Stack) Name WS? #">"
            ] (
                insert tail last first Stack reduce [
                    to word! Name either 1 = length? Content [
                        first Content
                    ] [
                        Content
                    ]
                ]
                set [Name Content] pop Stack
            )
        ]
        if file? XML [
            XML: read XML
        ]
        all [
            parse/all/case XML [
                WS? Declaration^ WS? Element^ WS? end
            ]
            Content
        ]
    ]

With the example from Rebol HQ:

    XML: {
    <?xml version="1.0"?>
    <PERSON>
        <NAME>Fred</NAME>
        <AGE>24</AGE>
        <ADDRESS>
            <STREET>123 Main Street</STREET>
            <CITY>Ukiah</CITY>
            <STATE>CA</STATE>
        </ADDRESS>
    </PERSON>
    }
    probe X: load-XML XML

Here's the result:

    [xml [#version "1.0"] PERSON [NAME "Fred" AGE "24" ADDRESS [STREET "123 Main Street" CITY "Ukiah" STATE "CA"]]]

And with my example XML fragment:

    XML: {<?xml version="1.0"?>
    <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:o="urn:schemas-microsoft-com:office:office"
     xmlns:x="urn:schemas-microsoft-com:office:excel"
     xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
     xmlns:html="http://www.w3.org/TR/REC-html40">
     <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
      <Author>Andrew John Martin</Author>
      <LastAuthor>Andrew John Martin</LastAuthor>
      <Created>2003-08-05T02:10:56Z</Created>
      <LastSaved>2003-08-05T02:10:57Z</LastSaved>
      <Company>Colenso High School</Company>
      <Version>10.4219</Version>
     </DocumentProperties>
     <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
      <DownloadComponents/>
      <LocationOfComponents HRef="file:///\\"/>
     </OfficeDocumentSettings>
    </Workbook>
    }

Here are the results of various path! values:

    >> x/workbook/documentproperties/author
    == "Andrew John Martin"
    >> x/workbook/officedocumentsettings/#xmlns
    == "urn:schemas-microsoft-com:office:office"

:)

Andrew J Martin
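
The shape of the Load-XML result (a flat list of name/value pairs, attributes marked with "#", and a lone value collapsed out of its block) can be mimicked in a few lines of Python for anyone wanting to experiment with the data structure outside Rebol. This is a loose imitation built on xml.etree.ElementTree, not a port of the parse rules; the names to_block, load_xml and select are hypothetical, and unlike Rebol paths the lookup here is case-sensitive.

```python
import xml.etree.ElementTree as ET

PERSON = """<PERSON>
    <NAME>Fred</NAME>
    <AGE>24</AGE>
    <ADDRESS>
        <STREET>123 Main Street</STREET>
        <CITY>Ukiah</CITY>
        <STATE>CA</STATE>
    </ADDRESS>
</PERSON>"""


def to_block(elem):
    """Content of elem as a flat list: '#attr' value, text, name content..."""
    content = []
    for name, value in elem.attrib.items():
        content += ["#" + name, value]  # attributes keyed like Rebol issues
    if elem.text and elem.text.strip():
        content.append(" ".join(elem.text.split()))
    for child in elem:
        content += [child.tag, to_block(child)]
        if child.tail and child.tail.strip():
            content.append(" ".join(child.tail.split()))
    # Mirror `either 1 = length? Content [first Content] [Content]`.
    return content[0] if len(content) == 1 else content


def load_xml(text):
    """Return [root-name, content], similar in shape to the output above."""
    root = ET.fromstring(text)
    return [root.tag, to_block(root)]


def select(block, *keys):
    """Follow keys through nested lists, taking the value after each key."""
    for key in keys:
        block = block[block.index(key) + 1]
    return block


person = load_xml(PERSON)
city = select(person, "PERSON", "ADDRESS", "CITY")
```

Because select returns the value after the first matching key only, it shares the limitation reported in the next message: a second child with the same name is not reachable by name alone.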

 [8/31] from: AJMartin:orcon at: 6-Aug-2003 21:53


> Load-XML: function [
'Load-XML can't handle multiple content and elements. For example:

    XML: {<?xml version="1.0"?>
    <document>
    <h>Heading</h>
    <p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
    <p>this is a <span>stuff</span> paragraph</p>
    </document>
    }
    probe X: load-XML XML

    [xml [#version "1.0"] document [h "Heading" p [#class "Initial" "Hi this is a" b "bold" "paragraph."] p ["this is a" span "stuff" "paragraph"]]]

    >> x/document/h
    == "Heading"
    >> x/document/p
    == [#class "Initial" "Hi this is a" b "bold" "paragraph."]

Access to the second 'p is a bit tricky...

Andrew J Martin

 [9/31] from: AJMartin:orcon at: 6-Aug-2003 22:07


Hi, Volker. You wrote:
> Idea is to name tag as email, attributes as issue.
> then one could write XML/Workbook@/#myAtribute too.

Thanks for the idea, Volker! I managed to do without converting tag names to email!. Am I missing something important? ::worried look::

Andrew J Martin

 [10/31] from: bry:itnisk at: 6-Aug-2003 12:09


How's it handling instances of nodes with the same name, and multiple text nodes interlaced in the tree? Also, I've put in a processing instruction. The idea behind processing instructions is that they can be used to pass info to the processor, so in this case I put in some simple Rebol code as an example PI:

    <?xml version="1.0"?>
    <doc>
    here's some text
    <p>para 1</p>
    <?rebol-process call: "stringvalue" print call?>
    <p>para 2</p>
    here's some more text
    <p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
    </doc>

I'd try it here, but I'd have to go through and fix some formatting errors Outlook put into the Rebol code. Hmm, I should probably use Rebol to access mail from this list anyway.

 [11/31] from: AJMartin:orcon at: 6-Aug-2003 22:20


Hi, Brett! You wrote:
> It has been a while since I looked at it, but I thought that the
> xml-object that Gavin wrote did build a tree of objects. However I seem to
> recall that the parser was not entirely complete.

Gavin's comments in his scripts indicate that he was going for the SAX event model approach? I didn't look any further than that (call me slack...).

> > XML/Workbook/DocumentProperties/Author: "Andrew Martin"
> How many instances of these would be literally in your code, or would you
> need to build them up in other code?

I'm uncertain about that at this point. I know that I can relatively easily build up path! values in Rebol.

> If we write a program to access a cell from the row in this example are we
> implicitly encoding the structure of XML (DTD) into a program?

I think that's unavoidable; after all, some program eventually has to understand the structure to work with it. I feel that the goal for 'Load-XML is to get rid of a lot of "drag" in checking for end-tags and so on, refactoring out the common code in converting XML to Rebol values.

> By the way does namespaces represent contracts for behaviour, or are they
> like rebol objects - a way to provide seperate contexts, or are they both?

I know very little about namespaces. I think they could be modeled as a context (or object!) in Rebol.

> I think that if you consider the different ways you might want to process
> the XML you will find different representations that can be useful.

Can anyone think of alternative representations of an XML document? There is:

    * Text;
    * SAX or event model;
    * DOM or tree model;
    * _________________;

> Also if a DTD (are these obsolete yet?) is available - the extra
> information might have an impact. E.g using a DTD maybe a program could be
> generated that knows how to traverse the XML implicitly (because it was
> generated to do so).

I think a DTD parser could be very helpful in writing a generic XML script that does generic things.

> Not being much help am I? :^)

Thanks, Brett! :)

Andrew J Martin

 [12/31] from: AJMartin:orcon at: 6-Aug-2003 22:31


Bryan wrote:
> How's it handling instances of nodes with the same name, and multiple text
> nodes interlaced in the tree,

Given this XML:

    XML: {<?xml version="1.0"?>
    <document>
    <h>Heading</h>
    <p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
    <p>this is a <span>stuff</span> paragraph</p>
    </document>
    }

'Load-XML produces:

    [xml [#version "1.0"] document [h "Heading" p [#class "Initial" "Hi this is a" b "bold" "paragraph."] p ["this is a" span "stuff" "paragraph"]]]

Which is OK for text processing, I feel.

> also I've put in a processing instruction, the idea behind processing
> instructions is that they can be used to pass info to the processor, so in
> this case I put in some simple rebol code as an example PI :
> <?rebol-process call: "stringvalue" print call?>

I haven't seen this before now. Is it something you've just thought of (and so we can change it to better suit Rebol and users)? Or is this related to some XML standard? If it's something we can change, I think I'd like to see it as:

    <?Rebol [print 3.14159]?>

Or perhaps I've misunderstood?

I tried your example:

    XML: {
    <?xml version="1.0"?>
    <doc>
    here's some text
    <p>para 1</p>
    <p>para 2</p>
    here's some more text
    <p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
    </doc>
    }

and discovered that 'Load-XML doesn't handle tags in a namespace. :(

Andrew J Martin

 [13/31] from: AJMartin:orcon at: 6-Aug-2003 22:44


A better version of 'Load-XML:

    Load-XML: function [
        [catch]
        "Loads XML as a Rebol compatible block of values."
        XML [string! file!] "The XML string or file."
    ] [
        Content Stack Attribute Value Attribute_Value^ Namespace_Name^
        Text^ Declaration^ Name Text Element^
    ] [
        Content: make block! 10
        Stack: make block! 10
        Namespace_Name^: [Alpha any AlphaDigit opt [#":" Alpha any Alphadigit]]
        Attribute_Value^: [
            WS* copy Attribute Namespace_Name^
            {="} copy Value to #"^"" skip (
                insert tail Content reduce [
                    to issue! Attribute Value
                ]
            )
        ]
        Text^: complement charset #"<"
        Declaration^: [
            "<?xml" (Name: 'xml) any Attribute_Value^ WS? "?>" (
                Content: reduce [Name Content]
            )
        ]
        Element^: [
            "<!--" thru "-->"
            | #"<" Z: copy Name [
                Namespace_Name^ (
                    push/only Stack reduce [
                        to word! Name Content
                    ]
                    Content: make block! 6
                )
            ] any [Attribute_Value^] WS? [
                "/>"
                | #">" (push Stack Name) [
                    some [
                        copy Text some Text^ (
                            if not empty? trim/lines Text [
                                insert tail Content Text
                            ]
                        )
                        | Element^
                    ]
                ]
                "</" (Name: pop Stack) Name WS? #">"
            ] (
                insert tail last first Stack reduce [
                    to word! Name
                    either 1 = length? Content [first Content] [Content]
                ]
                set [Name Content] pop Stack
            )
        ]
        if file? XML [
            XML: read XML
        ]
        all [
            parse/all/case XML [
                WS? Declaration^ WS? Element^ WS? end
            ]
            Content
        ]
    ]

With Bryan's test XML (modified by me!):

    XML: {
    <?xml version="1.0"?>
    <doc>
    here's some text
    <p>para 1</p>
    <p>para 2</p>
    here's some more text
    <p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
    </doc>
    }

I'm getting:

    [
        xml [#version "1.0"]
        doc [
            "here's some text"
            p "para 1"
            p "para 2"
            "here's some more text"
            p:p [#xmlns:p "http://www.uris.org/p" "not a para"]
        ]
    ]

Note that the "p:p" is currently converted to a word! value. I feel it should be a path! value, like this:

    p/p [xmlns/p "http://www.uris.org/p" "not a para"]

So too should the attribute names. What do people think?

Andrew J Martin

 [14/31] from: bry:itnisk at: 6-Aug-2003 13:49


> XML: {<?xml version="1.0"?>
> <document>
> <h>Heading</h>
> <p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
> <p>this is a <span>stuff</span> paragraph</p>
> </document>
> }
> 'Load-XML produces:
> [xml [#version "1.0"] document [h "Heading" p [#class "Initial" "Hi this is
> a" b "bold" "paragraph."] p ["this is a" span "stuff" "paragraph"]]]
> Which is OK for text processing, I feel.

I guess so, but shouldn't it give

    p ["this is a" span ["stuff"] "paragraph"]

though?

> > <?rebol-process call: "stringvalue" print call?>
> I haven't seen this before now. Is it something you've just thought of (and
> so we can change it to better suit Rebol and users)? Or is this related to
> some XML standard?

Processing Instructions aren't used that much. According to the XML standard, a PI is supposed to be passed to the processing application, which can then do with it what it will: ignore it, or parse it for other stuff. An application which does make use of PIs is Apache Cocoon. A common PI is the stylesheet PI, something like

    <?xml-stylesheet href="some.xsl"?>

(been a while since I used that; link here: http://www.w3.org/TR/xml-stylesheet/)

Some people don't like Processing Instructions and would like to get rid of them in the next version of XML, but it doesn't look like they will be gotten rid of. So a PI can be used to pass evaluatable code, if that's what one wants to do with it.

A PI is anything with a structure <?....?> not at the top of the document; that is to say, the XML declaration <?xml version="1.0"?> is not a PI despite the structural similarity.

> If it's something we can change, I think I'd like to see it as:

Yeah, it can be changed. As long as it has the <? ?> structure it can contain anything between the ? ?

> and discovered that 'Load-XML doesn't handle tags in a name-space. :(

Yeah, I thought there might be problems. Another problem is the rare occurrence of namespace-prefixed attributes. In fact I think I've seen some Excel Workbooks with that.
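
The point about processing instructions being handed to the application, while the XML declaration is not treated as one, can be observed with Python's xml.dom.minidom. This is a sketch for illustration only, using a shortened version of the earlier test document; the variable names are assumptions of the example.

```python
from xml.dom import minidom

# Shortened version of the test document with the rebol-process PI.
DOC = (
    '<?xml version="1.0"?>'
    "<doc><p>para 1</p>"
    '<?rebol-process call: "stringvalue" print call?>'
    "<p>para 2</p></doc>"
)

dom = minidom.parseString(DOC)

# Each PI node exposes its target (the name after "<?") and its raw data;
# what to do with the data is entirely up to the application.
instructions = [
    (node.target, node.data)
    for node in dom.documentElement.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]

# The XML declaration does not appear as a PI node at the document level.
top_level_pis = [
    node
    for node in dom.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
```

A loader that wanted to support something like the rebol-process PI would receive the target and the data string and decide for itself whether to evaluate, store, or ignore it.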

 [15/31] from: sqlab:gmx at: 6-Aug-2003 14:02


Hello Andrew

I would propose something like

    >Get-XML x/document/p

is the same as

    >Get-XML x/document/p/1

Then we can do

    >Get-XML x/document/p/2
    ..
    ..
    >Get-XML x/document/p/:n

with a function similar to

    Get-XML: func ['x [path ..] [...]

Or just an

    XML: context [
        load: func [...
        set: func [..
        get: func [
    ]

AR
> > Load-XML: function [ > 'Load-XML can't handle multiple content and elements. For example:
<<quoted lines omitted: 24>>

 [16/31] from: bry:itnisk at: 6-Aug-2003 15:03


> yeah I thought there might be problems, another problem is in the rare
> occurrence of namespace prefixed attributes.

Actually, on consideration, not too rare in XML coming out of academia or certain standards, where the xml namespace will be used, especially in examples like the following:

    <?xml version="1.0"?>
    <doc>
    <p xml:lang="EN">Good Morning!</p>
    <p xml:lang="EN-US">Howdy!</p>
    </doc>

Note that the xml namespace does not need to be declared anywhere.
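
The built-in xml prefix can be checked with a namespace-aware parser. In this Python sketch (illustration only; the constant name XML_NS and the variable names are assumptions of the example) xml.etree.ElementTree reports the xml:lang attribute under the reserved namespace URI even though the document never declares the prefix.

```python
import xml.etree.ElementTree as ET

# The URI that the reserved "xml" prefix is bound to by the namespaces spec.
XML_NS = "http://www.w3.org/XML/1998/namespace"

DOC = (
    "<doc>"
    '<p xml:lang="EN">Good Morning!</p>'
    '<p xml:lang="EN-US">Howdy!</p>'
    "</doc>"
)

root = ET.fromstring(DOC)

# Prefixed attributes are keyed as "{namespace-uri}local-name".
languages = [p.get("{%s}lang" % XML_NS) for p in root.findall("p")]
attribute_keys = list(root.find("p").attrib)
```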

 [17/31] from: bry:itnisk at: 6-Aug-2003 15:26


Two last things that would need to be handled, in order of priority it seems to me:

CDATA sections. Syntax:

    <![CDATA[ can be anything in here, including malformed xml ]]>

http://www.w3.org/TR/REC-xml#sec-cdata-sect

And comments:

http://www.w3.org/TR/REC-xml#sec-comments
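
How one existing parser deals with these two cases may be a useful reference point. In this Python sketch (illustration only, with a made-up sample), xml.etree.ElementTree hands back the CDATA content as ordinary text without interpreting the markup inside it, and with its default settings it silently drops comments.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one comment and one CDATA section holding markup that
# would not be well-formed if it were parsed.
DOC = (
    "<doc><!-- dropped by this parser -->"
    "<code><![CDATA[<b>unbalanced & unescaped]]></code></doc>"
)

root = ET.fromstring(DOC)

# The CDATA wrapper is gone; its content arrives verbatim as element text.
code_text = root.findtext("code")

# The comment does not appear among the children with the default parser.
child_tags = [child.tag for child in root]
```

Whether a Rebol loader should keep comments, or mark text that came from a CDATA section, is a design choice; this only shows one parser's defaults.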

 [18/31] from: andrew:martin:colenso:school at: 7-Aug-2003 10:06


The attached version of 'load-XML.r now handles these cases. (It doesn't handle name spaces correctly yet, though!)

Andrew J Martin
Attendance Officer & Information Systems Trouble Shooter
Colenso High School
Arnold Street, Napier.
Tel: 64-6-8310180 ext 826
Fax: 64-6-8336759
http://colenso.net/scripts/Wiki.r?AJM
http://www.colenso.school.nz/

 [19/31] from: andrew:martin:colenso:school at: 7-Aug-2003 10:18


Bryan wrote:
> > p ["this is a" span "stuff" "paragraph"]
> ...shouldn't it give p ["this is a" span ["stuff"] "paragraph"] though?

I deliberately did that in this line:

    either 1 = length? Content [first Content] [Content]

so as to make it easier and simpler to use.

> another problem is in the rare occurrence of namespace prefixed attributes.
> In fact I think I've seen some Excel Workbooks with that.

What do you think of using a path! value for these?

Andrew J Martin

DISCLAIMER: Colenso High School and its Board of Trustees is not responsible (or legally liable) for materials distributed to or acquired from user e-mail accounts. You can report any misuse of an e-mail account to our ICT Manager and the complaint will be investigated. (Misuse can come in many forms, but can be viewed as any material sent/received that indicate or suggest pornography, unethical or illegal solicitation, racism, sexism, inappropriate language and/or other issues described in our Acceptable Use Policy.) All outgoing messages are certified virus-free by McAfee GroupShield Exchange 5.10.285.0 Phone: +64 6 843 5095 or Fax: +64 6 833 6759 or E-mail: [postmaster--colenso--school--nz]
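The trade-off behind the `either 1 = length? Content ...` rule quoted in this message can be sketched in Python. The helper names `pack` and `unpack` are hypothetical, and the sketch assumes, as in the posted output, that a lone content item is always a text string.

```python
def pack(content):
    """Mirror of the quoted rule: a lone item is stored bare, not in a list."""
    return content[0] if len(content) == 1 else content


def unpack(value):
    """What a consumer has to do to treat both shapes uniformly."""
    return value if isinstance(value, list) else [value]


print(pack(["stuff"]))
print(pack(["this is a", "span", "stuff", "paragraph"]))
print(unpack(pack(["stuff"])))
```

The bare form is shorter to read and to path into, while code that walks arbitrary documents needs something like `unpack` to cope with both shapes.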

 [20/31] from: andrew:martin:colenso:school at: 7-Aug-2003 10:21


Curse that Listar!

-- Binary/unsupported file stripped by Listar --
-- Type: application/octet-stream
-- File: Load-XML.r
-- Desc: Load-XML.r
>> probe compress read %/c/Rebol/Values/Load-XML.r
#{ 789C9D55516FDB36107E967EC55945B07898A5641936C04BE778498B65483AA3 3192018252D0122311964981A4E6BACB7ED0FEE58E47C9769C3ECD0F967877BC BBEFBB8F541A7EE40B55431A02FE3EB0151FC337378A15A33F6F6FC8F65ED468 3BEA6DB126EB5C58678E9C19D01C9175DADA4A69344FE177B865DA0AE91DFC96 897A0C54EB42E95CC958721BCB2FE47DE08B3154D636E32459AFD7B17661B1B0 C93DAB6BBE4928E88FB5E494FA8BE5B954B52A373EF7475156D6A0E752351BED 16F02F7C7F727206BB2EBE83BD6DB1DF37670BDCF503BDCF5ADD28D303320E11 3003CC770CB95A35CC8A45CD6151AB7C09EA09FE6275CB4D97EB86C9B265A523 EF9D2C6B612ACF47BE946A5DF3A2E42B2E5D939EE7205AE80D9311E0E3425829 CC32C61A9DEF5ED54BAEE183B026AF2290F41C61526B2E90B5DC81903C2E7817 FFABE6D6C26F4C164896CB89CB8B5C151C6936B950ADD925CFFC9C392F5C3347 C8D96AA524CC98361CEE3D241DC1D1AC3555ACF1A99A58FB4D57CC22BE9F9269 5BB6C6268E61B2DF736D849263388DCFE213326561D8EB650C4FADCC2D0620F6 20CD99CDABECFFD01CB8E0D4582D6439802754E500F3CC2B4E59BC1D94268F0B CF5CBD4B252D120F779661BAA9C5A8456B3BA4BBF5275A3F92FE4DC372FEC9BD 3D86C19C7FB68F70C5F39A69E640F81870767857D3541F5FD61AC38A2D3B0003 383D09032AFEDA7C500CC731AD9B8A01931BA0B72B510A0BAAB190BE89C6111C B80BE7CEB2303884E15416040F77DF22A1CD660FF54145F8FB6DF48F8FF17C58 056FA2C72802B3140D1C63924048C3B5058BC7177A32352FDA9C539120C03DC2 98960F7675C84E19E92D73FFC3905E9CDC1C65D862743E188D22B0956E211A8D 7E89D03DBBF69E496F9FA039F819665AE5DC1837E16B89A36E4950068E67D743 609A8328A5C2B6E26E626392921F0FE49513B74568E7288BFD597AA2B0DCE755 1DC17177FBE162E8593E94C7C3DD845AF2D46CE7DDF341CAE8ACFBA07B9DF872 AE0D4F3AC5138B8783A1FC4183673051B2DE74F2EDEB20E56BA58B01BC2AF875 0DFE482ED70C0AD5014B0F9165048D5A8912C4F78C643998AE83AEB8AB35EC66 6E54DF78B09D28EEC1E9C1B3B7A604904E0905FB73F40C38F4F4F26A3A9FA61D 07148280A22CC38267A4BCACC38FEA7B02A92CF055633713C09E57492D24377E 5BD7C1D745EA227A7FD6BD0CFDF379EFE4EEDC78999C275B1534AAF1C0879E65 C78FE3C493F8EA68D4CC58BC79B4B107B3EACFC86E6064E1C25678C59FC25BA8 B92C6D35D9369EFA34FD58217D3160FA776A7EA9B65DBF7BBA43F6DC6538A10B 
D27542D7B1E6FE8BED63F0034BAEC69D91045749CE0CDFEE081CEEFD23D3DBF6 19746B2E0B948B560BDED11676CD761DFA6AF85DC8C2FF007B9F7DD773080000 }

Andrew J Martin

 [21/31] from: andrew:martin:colenso:school at: 7-Aug-2003 14:07


This version handles name spaces better.

[
Rebol [
    Name: 'Load-XML
    File: %Load-XML.r
    Title: "Load XML"
    Author: "A J Martin"
    eMail: [Rebol--orcon--net--nz]
    Web: http://www.rebol.it/Valley/
    Owner: "Aztecnology"
    Rights: "Copyright (c) 2003 A J Martin, Aztecnology."
    Tabs: 4
    Purpose: "Loads XML as a Rebol compatible block of values."
    Language: 'English
    Acknowledgements: [
        "bryan" [bry--itnisk--com]
        "Volker Nitsch" [nitsch-lists--netcologne--de]
        "Brett Handley" [brett--codeconscious--com]
    ]
    Needs: [%"Common Parse Values.r" %Push.r %Pop.r]
    Date: 7/August/2003
    Version: 1.5.0
]

make object! [
    Namespace_Name: func [Name [string! word! path!] Type [datatype!]] [
        either string? :Name [
            either found? find Name #":" [
                to path! map parse/all Name ":" func [Word [string!]] [
                    to word! Word
                ]
            ] [
                to Type Name
            ]
        ] [
            :Name
        ]
    ]
    NameChar: union AlphaDigit charset ".-_:"
    Name^: [[Alpha | #"-" | #"_"] any NameChar]
    Comment^: ["<!--" thru "-->"]
    PI^: ["<?" thru "?>"]
    Miscellaneous^: [any [PI^ | Comment^]]
    set 'Load-XML function [
        [catch]
        "Loads XML as a Rebol compatible block of values."
        XML [string! file!] "The XML string or file."
    ] [
        Content Stack Attribute Value Attribute_Value^ Text^
        Declaration^ Name Text Element^ Miscellaneous^
    ] [
        Content: make block! 4
        Stack: make block! 10
        Attribute_Value^: [
            WS* copy Attribute Name^ #"=" [
                #"^"" copy Value to #"^"" skip
                | #"'" copy Value to #"'" skip
            ] (
                insert tail Content reduce [
                    Namespace_Name Attribute issue! Value
                ]
            )
        ]
        Text^: complement charset #"<"
        Declaration^: [
            "<?xml" (Name: "xml") any Attribute_Value^ WS? ?> (
                Content: reduce [
                    Namespace_Name Name word! Content
                ]
            )
        ]
        Element^: [
            #"<" copy Name [
                Name^ (
                    push/only Stack reduce [
                        Namespace_Name Name word! Content
                    ]
                    Content: make block! 6
                )
            ]
            any [Attribute_Value^] WS? [
                "/>"
                | #">" (push Stack Name) [
                    some [
                        Comment^
                        | PI^
                        | Element^
                        | "<![CDATA[" copy Text to "]]>" 3 skip (
                            if not empty? trim Text [
                                insert tail Content Text
                            ]
                        )
                        | copy Text some Text^ (
                            if not empty? trim/lines Text [
                                insert tail Content Text
                            ]
                        )
                    ]
                ]
                "</" (Name: pop Stack) Name WS? #">"
            ] (
                insert tail last first Stack reduce [
                    Namespace_Name Name word!
                    either 1 = length? Content [first Content] [Content]
                ]
                set [Name Content] pop Stack
            )
        ]
        if file? XML [
            XML: read XML
        ]
        all [
            parse/all/case XML [
                WS? Declaration^ WS? Miscellaneous^ WS?
                Element^ WS? Miscellaneous^ WS? end
            ]
            Content
        ]
    ]
]
]

XML: {<?xml version="1.0"?>
<document>
<h>Heading</h>
<p class="Initial">Hi this is a <b>bold</b> paragraph.</p>
<p>this is a <span>stuff</span> paragraph</p>
<![CDATA[<greeting>Hello, world!</greeting>]]>
here's some more text
<!-- A comment! -->
<p:p xmlns:p="http://www.uris.org/p">not a para</p:p>
<PERSON foo="bar">
<NAME bar="blech">Fred</NAME>
<AGE>24</AGE>
<ADDRESS>
<STREET>123 Main Street</STREET>
<CITY>Ukiah</CITY>
<STATE>CA</STATE>
</ADDRESS>
</PERSON>
</document>
}

[xml [#version "1.0"] document [h "Heading"
    p [#class "Initial" "Hi this is a" b "bold" "paragraph."]
    p ["this is a" span "stuff" "paragraph"]
    <greeting>Hello, world!</greeting> "here's some more text"
    p/p [xmlns/p http://www.uris.org/p "not a para"]
    PERSON [#foo "bar" NAME [#bar "blech" Fred] AGE "24"
        ADDRESS [STREET "123 Main Street" CITY "Ukiah" STATE "CA"]]]]

Andrew J Martin
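For readers who don't speak Rebol, the name handling in this version can be paraphrased in Python. This is my reading of the `Namespace_Name` helper in the script, not a translation anyone in the thread endorsed: a name containing ":" is split into parts (the Rebol code builds a path! such as p/p), anything else is converted to the requested type, and non-string input passes through. The function and parameter names below are illustrative.

```python
def namespace_name(name, convert=str):
    """Split 'prefix:local' into a tuple of parts; otherwise apply `convert`."""
    if not isinstance(name, str):
        return name  # already converted, pass it through unchanged
    if ":" in name:
        # A tuple stands in for the Rebol path! value (p:p becomes p/p).
        return tuple(name.split(":"))
    return convert(name)


print(namespace_name("p:p"))
print(namespace_name("xmlns:p"))
print(namespace_name("document"))
```

Note that this keeps the prefix as written in the source text, so two documents that use different prefixes for the same namespace would produce different names.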

 [22/31] from: andrew:martin:colenso:school at: 7-Aug-2003 14:24


> This version handles name spaces better.
And this version has better formatting.
>> probe compress read %/c/Rebol/Values/Load-XML.r
#{ 789CA5566D6FDB3610FE2CFF8A338320F5304BC9BA174048EB78498B6D48BAA0 319201861CD0126369964981A4E6BAE81FDABFDC9127C97692A5C0960F117577 3C3EF7F0B993A7BD8F62AE4A98F600FF3EF09588E1E852F16CF8C7D5A5B7BD2F 4AB41DB6B6507BEBA4B0CECC9C19D0CCBC755CDB5C69348FE137B8E2DA16921C E28A17650CFEAC33A5532543296C283F7BEF9D98C7905B5BC551B45EAF43EDC2 C2C246B7BC2CC526F241BFAFA5F0A93F5B914A55AAC586727F2C16B935E83957 D546BB17F81BBE3B3E7E0D5B14DFC2CEB690F64DF81C777DEFD7D7B5AE94690B 32AE22E00638218654AD2A6E8B7929605EAA7409EA01FEE2652D4C93EB92CB45 CD178EBC7772511626273ED2A554EB52640BB112D281249E0336D71B2E19E0E3 ACB0B230CB10CF687CB7AA5C0A0D1F0A6BD29C81F4CF2126B5E60C594B5D1152 849968E27FD6C25AF885CB0CC97239F1F52C5599409A4D5AA8DA6C932774CF42 640ECC2172B65A2909D75C1B01B754926670785D9B3CD4F85455A869D305B758 DF4FD1B85ED4C6468E616FBF15DA144AC67012FE101E7B53D2EBADF852809AFF 2952DBC7AA03272D53F154DC93C81E6A99C2D4AD616AAC2EE4A20F6BA5B33E20 D3793F81C9A64257C62DB7B8EA2789CB1288C2E6C80DED18414C09D0D17A1E54 2DB3113C1432F372860316338A08ACA2E4B0E215AEB0E4080546612E8A30DD21 8A0E53732C6D267CCE4FA6C43F926D768FD9A5EB75EEC61BB756674C888EF39C A3A06B89E4C1B8AC727E512C0A0B69EE905960E1F03E66143AC3CB9AFA18F882 150D997FDCB304B8DC409B0CF3BAFB44A5B97876DA1F62A0CD750D6C387CCBD0 7DFD2B7946AD7DE4CD57854945597229502C2EC2659D62301ED3664C30CEC1EA A683A7CB3AF0AEC269CA6D9A27FFA58182C045772A78C08183F7CF26B9F069C8 0E4A7B8F8F2752CF95B4880C6E2C4F97F83EB61839AF6D2364E8DEEFFDFB0C43 26E2939DC185484BAEB9C33EA3DB77767857FA2E9DC13E1D8FCF8BC16BDBD7D1 C7F911041EC0BEF9E47817500320262DDCDD7C8384549B2D400F628637FAA695 EA019B314651540CAA8B6C6659543EC429E0E869CCD14E4802AFFCB39046680B 168730B4AC6991D5A968D5BDDF9E3BC80A636AD1A7207FCAAEF4075ED141D032 1BFB8B261A3B1D1FB0537767BBA43744A00E3FAD4A06AF682430F732F08A7ECC 1CDCDD8CBC589B82BAAB78B10AFFCF372DB99B5D4F2BC07FEDE537D01C68A276 3B61A8131B044185133252B2DC90FE1E21F91A947D2C0D98E715F623F906CD8D FAD67CCC4FE209A2B359F496C68363CBA16C003A0883169F51AB2DD6B6C57117 
F57C4B46E3FF02384BA6E717E3C978DAB0E21B06F5C692048F79ED25D732837A 7B00A92C885565372340A82BDAD01EF8BC225D481790B4AB4107627BB0474FBD FCC29951594861FEFFC98D0147D269D489B55215F13AA0BB75FC3BCAFFB5EF4A 6E2C4E306DECB382F98A5E9AAFDB09BC8152C885CD471DF829256D5E7154B5AB 5D9DBB46A4AF6D17D755B0DF083D4FA51BB5233F7E3D3E5CB85EA3DF7A6DA4FB 767A6FF7298D526EC47657E048D9EDFBCEF864C292794F772FC409996DBFB03B 9DD47C5CF1E747D2FB07F7D30821DA0A0000 }

Andrew J Martin

 [23/31] from: andrew:martin:colenso:school at: 7-Aug-2003 15:11


Will wrote:
> this doesn't:
> XML: {<?xml version="1.0"?>
<<quoted lines omitted: 3>>
> }
> Am I missing something?
Wait! Let me try:
>> XML: {<?xml version="1.0"?>
{ <docu-ment>
{ <h>Heading</h>
{ </docu-ment>
{ }
== {<?xml version="1.0"?>
<docu-ment>
<h>Heading</h>
</docu-ment>
}
>> load-xml xml
== [xml [#version "1.0"] docu-ment [h "Heading"]]

Seems OK here. Maybe it's the version you've got? Or perhaps you're missing one of my support Values?
>> probe compress read %/c/Rebol/Values/Load-XML.r
#{ 789CA5566D6FDB3610FE2CFF8A338320F5304BC9BA174048EB78498B6D48BAA0 319201861CD0126369964981A4E6BAE81FDABFDC9127C97692A5C0960F117577 3C3EF7F0B993A7BD8F62AE4A98F600FF3EF09588E1E852F16CF8C7D5A5B7BD2F 4AB41DB6B6507BEBA4B0CECC9C19D0CCBC755CDB5C69348FE137B8E2DA16921C E28A17650CFEAC33A5532543296C283F7BEF9D98C7905B5BC551B45EAF43EDC2 C2C246B7BC2CC526F241BFAFA5F0A93F5B914A55AAC586727F2C16B935E83957 D546BB17F81BBE3B3E7E0D5B14DFC2CEB690F64DF81C777DEFD7D7B5AE94690B 32AE22E00638218654AD2A6E8B7929605EAA7409EA01FEE2652D4C93EB92CB45 CD178EBC7772511626273ED2A554EB52640BB112D281249E0336D71B2E19E0E3 ACB0B230CB10CF687CB7AA5C0A0D1F0A6BD29C81F4CF2126B5E60C594B5D1152 849968E27FD6C25AF885CB0CC97239F1F52C5599409A4D5AA8DA6C932774CF42 640ECC2172B65A2909D75C1B01B754926670785D9B3CD4F85455A869D305B758 DF4FD1B85ED4C6468E616FBF15DA144AC67012FE101E7B53D2EBADF852809AFF 2952DBC7AA03272D53F154DC93C81E6A99C2D4AD616AAC2EE4A20F6BA5B33E20 D3793F81C9A64257C62DB7B8EA2789CB1288C2E6C80DED18414C09D0D17A1E54 2DB3113C1432F372860316338A08ACA2E4B0E215AEB0E4080546612E8A30DD21 8A0E53732C6D267CCE4FA6C43F926D768FD9A5EB75EEC61BB756674C888EF39C A3A06B89E4C1B8AC727E512C0A0B69EE905960E1F03E66143AC3CB9AFA18F882 150D997FDCB304B8DC409B0CF3BAFB44A5B97876DA1F62A0CD750D6C387CCBD0 7DFD2B7946AD7DE4CD57854945597229502C2EC2659D62301ED3664C30CEC1EA A683A7CB3AF0AEC269CA6D9A27FFA58182C045772A78C08183F7CF26B9F069C8 0E4A7B8F8F2752CF95B4880C6E2C4F97F83EB61839AF6D2364E8DEEFFDFB0C43 26E2939DC185484BAEB9C33EA3DB77767857FA2E9DC13E1D8FCF8BC16BDBD7D1 C7F911041EC0BEF9E47817500320262DDCDD7C8384549B2D400F628637FAA695 EA019B314651540CAA8B6C6659543EC429E0E869CCD14E4802AFFCB39046680B 168730B4AC6991D5A968D5BDDF9E3BC80A636AD1A7207FCAAEF4075ED141D032 1BFB8B261A3B1D1FB0537767BBA43744A00E3FAD4A06AF682430F732F08A7ECC 1CDCDD8CBC589B82BAAB78B10AFFCF372DB99B5D4F2BC07FEDE537D01C68A276 3B61A8131B044185133252B2DC90FE1E21F91A947D2C0D98E715F623F906CD8D FAD67CCC4FE209A2B359F496C68363CBA16C003A0883169F51AB2DD6B6C57117 
F57C4B46E3FF02384BA6E717E3C978DAB0E21B06F5C692048F79ED25D732837A 7B00A92C885565372340A82BDAD01EF8BC225D481790B4AB4107627BB0474FBD FCC29951594861FEFFC98D0147D269D489B55215F13AA0BB75FC3BCAFFB5EF4A 6E2C4E306DECB382F98A5E9AAFDB09BC8152C885CD471DF829256D5E7154B5AB 5D9DBB46A4AF6D17D755B0DF083D4FA51BB5233F7E3D3E5CB85EA3DF7A6DA4FB 767A6FF7298D526EC47657E048D9EDFBCEF864C292794F772FC409996DBFB03B 9DD47C5CF1E747D2FB07F7D30821DA0A0000 }

You'll also need:
>> probe compress read %"/c/Rebol/Values/Common Parse Values.r"
#{ 789CAD566D53DB4610FEACFB151BD10C21AD2D636801B5D3C4059CA4631A0FD0 3013233367F9B055E43B8D748A634A7F50FF6577EFF4666A7F6A3FC068F7D9D7 E7F6F63C629762A2621831E737BE103EEC9EAAC542C9D690A799687DE2712E32 E65C473A46D0B52018102CE832A71F11F87213DA4E11EFE57AAE5274EFC1AF70 C1531D49D47E5C4A61948F5A8452C56AB642ED65349BEBCCA44A562909F03774 3B9D03A89DBF83864F1B9DC4058F621F4C2B6F551A2AD99642B7E523736EC4C4 87B9D689EF79CBE5B29D924D3BD21ED6178B95877D0B31C584A397173C69A701 36CB27281F326798A789CAB6F73DE07296F319B1762E677194CD9973C635CA07 5E2F9FE599F6A872E67C12691629E943B7DD6D77981330D6C78AC798552B1072 0AD9439404CE8FD8649AC702F49C6BE0F192AF32B847CBACCDAEF230C44AC967 8B61662DD0F663A885F6219C53BD1A463BEEF855A7B3E7420BE8B3DFDF730376 16CDA28691DBD9EF1E1C7EFFC3D1F1896B312225530B01460AD8EF494207D688 DAB3113F63B4815A3E03B9051F11ECC5C99CFB904B24014C1830F616A8F218A9 B02EAAB32E4663CB68A0EB7E4591A74AEA54C5DB9BDFC7E6CDC79161E1BDF85A 66311136B4D727FBA29B7B72B9F62940CB655743FA02970D6EAEEA8CA998E6A1 80D1D510DE5F9B54BD0EA542A3D765C9F86D146F50C1E5AA90FB26F0AB4E6FCF 65DB234AB18C2329E0F41206FD8035A252903A2649EF529ECCA3B04107732845 77BF22E4E81CB319E571A7FAEA56F0F169A5AC0C4F6AEF930A3E39AF95FD52D9 DB5F9B3A1AFD611A49CD27B4322CEF458D9BBAADC9631FA4163391E2FCDBC367 67228C16DC5C22AB41EBB65BA001BB5052AC08DC71BF29B5D6A25B0E34A36B74 13C9A95AE235C32506127720845C4A85838093C4230944A5BAC79B26F0EE538D 1C6F579AF9ACAFD249349D0A59B3FBE7ADE7BF7EE3FEF4F3D35F8C0DF090EECE E5146B288FEC892E7BC07E89B97CB8231C319A82F24C9B50561C63ADC18A17FC 41809AFC2142FD82CEF2B33231B0C96F5D8CBEE3B6DC00F6CB1651F61BFD3A54 E2EE75B41063183D33AA459598E16FEA02F41D28396B21A97A8E85A562A1BE08 58F0C42E5D2F56218F856771B8CF65082323E0646A3CF0D98B20A07A9DF27077 9F90E06405C608C3E3603857F84AE8FF3F8797E0AB6113C14191CA487E938527 68748852B318AA4D6014AE5563C7B91EAE005CF8AB669C92657A07C6A61AFA73 466806558CA2985A1E1D56553428779CA71AD9EE8CA18D31FD2B0ECF736164CF 9914342441D1F886191A8A7411692D7050C5D730CEA702AA3B0AD590DB675E9A 
1F0976DB547E18B6AFE229EDFF5169045404029722E63AFA22EE86DC106E3CAD 35A2BD49A6E25C57A8A99C867ECDADE49462179CAE399A26CBC4D427D2B61660 834129FD6756CE52CC427BC63E527471829AABF176B2C60DB6C6E877FB2FBAC6 EB7C8D9F1366B7DBED06C6C6256503BA33770571B654588FB1C6CDD87211B07F 00243E18D2180A0000 }
>> probe compress read %"/c/Rebol/Values/Push.r"
#{ 789C9D50CB4EC330103CDB5FB18D84B880531E97E644C543A2A280DAAA1CAA1C DC76492C8C1D391BAA80F820FE92750A2A881BC7999DDD999D859CE0D25B5848 71AB9F3183FDFBA62EA59819B28C92881229AE4C447B1D5481896143A50F2C18 C208C63A9071CCDE6D1C76E42BE1CA79EB8B96D989294AAA993EF7551B22800F 38EEF74F60B77C003F76142FE1581B9B4197EECC879577CA2129F72AC5032E33 2889AA2C4D379B8D0A51A30CA5736D2DB629A7D74BF63B95E2BE0995AF39FADB B5AB31500D1A5EB46D108C23CF804983CCBA3504A426B81AA8C46FBA44BD56EF 52DC685734BA88F55CBAC29AD8D08526C6478374D4D8368DEF4831C7501BEF98 567D35902297323696C163E356B1E3E4FF31B89329E9D5132CB66C0F2A1FA807 4B4335522F17C9F4EB08F184EF025F369D5DDC9D77760BEDDA436A2BECE590DE 39DB8A64C63EDB2CBFF439A78DBE8086930488E2F881D84A521FF13650D6DDE6 51BE9BFF197565E4F2137382C5E370020000 }
>> probe compress read %"/c/Rebol/Values/Pop.r"
#{ 789CAD504D4F0231103D777EC54062BC982EA817F624F1E3404409183C100E05 CB6E636937ED2C048C3FC87FE97457233FC0DBBC37F3DEBC99054CF5CA5B5C80 78525B9DE3F9C457205E0C59065D065D100F2681B38464603CACA9F481DB431C E15805328ED9E7BDD30D7924BD76DEFAE2C0ECD4142545A66F7D750809E0175E F67A57F827BEC0138D64911E2B63736CA2DDF8B0F64E3A4DD21D41BCEA558E25 519567D97EBF9721CD4843D95C59AB0F1967572BDE770D625287CA474EFE31D5 540717914A8D1B1322E14ED95AA371A830EA607444E5DE30E8ADDF716D0837C1 6F9BF9B62D3F413C2A57D4AA484FBA778535B10471A788717F908D6A7BC8D25D 20E63A44E31DD3B22707209600FCB91C37B55B1337D2B3BBFF10891F3523B57E C7454B74B0F2813AB83214357596A23BFBF121EE1847489E8BAA314AEA250799 A7AD27658E9561CBD6B80FA2DDDF6210F9EF38C012BE017850B5B63C020000 }
>> probe compress read %"/c/Rebol/Values/Map.r"
#{ 789CA554CD4EDC30103E3B4F311B092D54DD840DD0432E94F2A31641A9680587 550EDEC49BB878EDC871D806C403F52D3BB69305AA02872A519C197FF3CDAF3D 0B2ED95C099805E42B5DB214C6E7B40EC80F6E040A210A61404EB81536AC1469 940F5A53298DDB07700AE7541B2E517BB192CC29EF0CCBA512AAEC507BC9CBCA 34A83E5475A7AD00BF21D9DEDE8147E3F7F0C4264223764EB948C185F651E95C C9483213C9BB805CB3790A9531751AC7ABD52AD2161371135F51215817631E8C 15E870B671A0CB76C9A469229D614A748EDADD807C6B75AD1A4CE81EF3694069 A0752D386BC0540C16ADCC0D57128C0264042698E300B570FB0DD3088D1E0272 4665D9D2D2D6EC5896823755408E7FD1656D8B751F1082F48E0E6612DF763967 7A9465567A0712D72924B093418F1CC4572C9E216117F6E0C380A788AA3597C6 FEBE8E84F9132C0A6FA1217F864711AB49B00087D4B052E90E6BDD1A2E608AFA 83FC46AA956045E9CB96DAC922E1A96202B031A20BADB7F08C16BC11F436B4C2 27CD8C81CF5416D840AB419E23E44E21D98B4F5BD1C5765E0272C574839D4961 1AED45D3006101869D3EB60C5DFD574F6196539357360D5D4E315BD94D06DB51 8F1AF9DDE4E5DDF8428A0EC22F1255C607A059D30A33B85B87431BA08373CC3B 3E6931B8F048C9B1015E4AA5194825D9086EA9683D26C31C4F06FBEFCE148F89 63F74B03D74A178D479A4AABD544C909D31A6B625B8151BB95D842B81FF23491 7D485DEE3E2A2F250EB539B84D7B88F7EEA564CB61B2B7889367C4D37F1127CF 88A74F885D36B0A4370C5C3E23AC6BD1E6ACF7366E72CD6B03E39C4AA9CCA46D 6CE17F2A2E7B00B90F1F60A944E12E3908F17192E96A36E48D108F751EC90BF6 8F16C9DF16EE6B3F7D3B521F70DF2AC16469AAFD5E44946B560AEBAB0AD2A11A B85928C8D5D25E5671C158EDC358E058D0BC82D9A6B3DDCA066E1F245FE0CC18 58B799901607D1EC63E14D6CB5E37E5CD653D4F378F066CDF31B98CDECDC6186 1E8BE71F6F094B6B27B44766037F6FC2DDBC835F628567C09BD8D3B005062FF4 F584F6B4AF95CDDF0278C2B3E00F1902F8BEA3060000 }
>> probe compress read %"/c/Rebol/Values/Arguments.r"
#{ 789CAD91CD4EEB301085D7F6534C2B5DB181A4E2B2CA8ABF56085D7E54215854 5D4C936962D5B52B7B4208573C106FC938F752780096F3CD19CF9CE3859ED3CA 5B5868758B5B2AE0E02CD4ED961C47AD1E0C5B41E33D1A6B353309FDFA625910 7AD672E34392C235DC6060E384DE758E06F8CA543A6F7DDD0B9D9BBAE128F8C2 EFFA900A7887E3C9E4377C0D1FC2B7994C86E8068D2D6038F6D487D2BBCC1167 EE55AB275A15D030EF8A3CEFBA2E0B499319CE1FD15AEA73F1812BD977A2D57D 1B763E264B73E236B808DC10E0A715F0EB01AC5B57B2F10E3002C2CAFA72935A 9D0FD5089ED1B614D34D7FD0D52DD629B4A9ABAD898D56D317DCEE52427FB552 FB8CA0F896AA7A93BCCA8DF39DA5AAA68116E903D4F83C10335CA1ABE470D9A0 965A5D22CB73C727F9756BFB3CE5A4D5238528F709CE26D9448B4CEF1714C3F9 E9BD1F31398305BAFEE8533D5AC2F841A667FF6B512CD3AE86B082D2120610F9 E0666D5C056B132243318340520F6647D263F9CD7DEF9F4FF1B0D41F0CDBCED2 8F020000 }
> Thank you for the nice tool.
Thank you for trying it out and helping to make it better!

Andrew J Martin

 [24/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 8:11


On Tue, 5 Aug 2003 22:04:47 +1200, "A J Martin" <[AJMartin--orcon--net--nz]> said:
> Thanks, Bryan and Will!
> Bryan wrote:
<<quoted lines omitted: 12>>
> from
> MS Excel 2002):
As I said, I would still recommend building a DOM implementation over-top of xml-parse or some other xml-parser implementation. There were (maybe still are?) very real and significant shortcomings in REBOL's built-in parser, and so I'd recommend that you need a better parser implementation underneath your DOM. That said, my xml-parse implementation is slower than REBOL's -- hey I'm not the world's best REBOL developer and I basically brute-force translated the EBNF grammar productions from the XML spec into REBOL's most-excellent parse capability.
> XML: {<?xml version="1.0"?>
> <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
<<quoted lines omitted: 23>>
> And set it with Rebol script like:
> XML/Workbook/DocumentProperties/Author: "Andrew Martin"
The xml-object script will let you do that. Check out the web-archived docs at: http://web.archive.org/web/20020210063622/www3.sympatico.ca/gavin.mckenzie/rebol/xml-object-info.html
> Also we should think about several tags at the same level of nesting,
> like
<<quoted lines omitted: 8>>
> attribute in the "DocumentProperties" tag?
> XML/Workbook/DocumentProperties/________
I haven't found having a different syntax for addressing attributes to be helpful. I just consider the attribute to be a child of the element. Yes, in theory it is possible to have an attribute and a child element of the same name, but in practice I've never seen such an XML file in five+ years of working with XML. Or, at least, not within the same namespace.
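The "attribute is just another child" idea can be sketched as follows in Python. This is an illustration of the approach described in this message, not code from xml-object; the `lookup` helper and the decision to check attributes first are assumptions.

```python
import xml.etree.ElementTree as ET


def lookup(element, name):
    """Return an attribute value if present, else the first matching child's text."""
    if name in element.attrib:
        return element.attrib[name]
    found = element.find(name)
    return None if found is None else found.text


person = ET.fromstring(
    '<PERSON foo="bar"><NAME bar="blech">Fred</NAME><AGE>24</AGE></PERSON>'
)

print(lookup(person, "foo"))   # resolved from the attribute
print(lookup(person, "AGE"))   # resolved from the child element
```

The cost of the single syntax is visible in the helper: if an attribute and a child element share a name, one of them silently wins.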
> [...]
Gavin.

 [25/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 8:51


On Wed, 6 Aug 2003 15:03:10 +0200, "bryan" <[bry--itnisk--com]> said:
> >yeah I thought there might be problems, another problem is in the rare
> >occurrence of namespace prefixed attributes.
<<quoted lines omitted: 7>>
> </doc>
> note that the xml namespace does not need to be declared anywhere.
Yeah, well the xml: namespace is special. It is implied. Though, there is actually a formal namespace URI for the xml namespace, so it *is* possible for it to be declared in an XML file. The xml namespace is only used for xml:lang and xml:space though. It is illegal for people to use this namespace other than for xml:lang and xml:space.

Besides the xml namespace, others will be declared. Namespaces are tricky to implement in a parser/dom. The following snippets of XML are equivalent, though their syntax differs greatly:

<a:a xmlns:a="whatever">
  <b xmlns="something-else">text</b>
</a:a>

<a xmlns="whatever">
  <b:b xmlns:b="something-else">text</b:b>
</a>

<a:a xmlns:a="whatever">
  <a:b xmlns:a="something-else">text</a:b>
</a:a>

Note the nasty third example, where the namespace prefix 'a' is redeclared locally for the element scope of 'b'.

In short, for a namespace implementation the prefix: part of a name is almost entirely useless. It would be incorrect, or at least very dangerous, to build code that path-ed into an XML document with an assumption about the prefixes: e.g. it would be bad to write a path like xml/a:a/b:b because the prefixes a: and b: are only present in one of the several ways to encode the same XML above.

Gavin.
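The equivalence claimed in this message can be checked with a namespace-aware parser. The Python sketch below uses the standard `xml.etree.ElementTree`, which reports each element as `{namespace-uri}local-name`; the three input strings are the three encodings discussed here, written with proper closing tags and without the inter-element whitespace.

```python
import xml.etree.ElementTree as ET

variants = [
    '<a:a xmlns:a="whatever"><b xmlns="something-else">text</b></a:a>',
    '<a xmlns="whatever"><b:b xmlns:b="something-else">text</b:b></a>',
    '<a:a xmlns:a="whatever"><a:b xmlns:a="something-else">text</a:b></a:a>',
]


def expanded_names(xml_text):
    """List every element's expanded name, root first, in document order."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


names = [expanded_names(v) for v in variants]
for n in names:
    print(n)
```

All three print the same list, which is why paths keyed on expanded names stay stable while paths keyed on prefixes do not.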

 [26/31] from: bry:itnisk at: 8-Aug-2003 15:55


>I haven't found that having a different syntax for addressing attributes
>to be helpful. I just consider the attribute to be a child of the
>element. Yes, in theory it is possible to have an attribute and a
>child-element of the same name, but in practice I've never seen such an
>XML file in five+ years of working with XML. Or, at least not within the
>same namespace.

Unfortunately I have to be able to differentiate between them. I'm also more likely to handle more document-like structures: bibliographic materials, TEI, Docbook, xhtml etc. As such structures are less regular, they increase paranoia. So basically, is the following:

Xpath: "/body/a[@href and not(@class)]"

no more helpful than a path which treats attributes as elements? This may be occasioned by my being set in my ways, but I find it hard to believe I wouldn't feel the lack almost immediately, and have that lack grow even more annoying as time went on.
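To show what the XPath expression in this message selects, here is a hedged Python sketch. As far as I know, the path subset in `xml.etree.ElementTree` has no `not()`, so the predicate is written as an ordinary filter over the `a` children of `body`; the sample document is invented for the illustration.

```python
import xml.etree.ElementTree as ET

body = ET.fromstring(
    "<body>"
    '<a href="one.html">one</a>'
    '<a href="two.html" class="nav">two</a>'
    '<a name="anchor">three</a>'
    "</body>"
)

# Equivalent of /body/a[@href and not(@class)] for this small document.
hits = [
    a for a in body.findall("a")
    if "href" in a.attrib and "class" not in a.attrib
]
texts = [a.text for a in hits]
print(texts)
```

The filter only works because attributes are kept apart from child elements; a model that merges the two would need some other way to ask "has an href attribute".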

 [27/31] from: bry:itnisk at: 8-Aug-2003 16:05


>Yeah, well the xml: namespace is special. It is implied. Though, there
>is actually a formal namespace URI for the xml namespace, so it *is*
>possible for it to be declared in an XML file.
Right, hence I noted it need not be declared, not that it is never declared ;)
> The xml namespace is only
>used for xml:lang and xml:space though. It is illegal for people to use
>this namespace other than for xml:lang and xml:space.
Well it's reserved, so later versions may have more in that namespace.
>it would be bad to write a path like xml/a:a/b:b
>because the prefixes a: and b: are only present in one of the several
>ways to encode the same XML above.
Hence the need to bind a namespace declaration to a namespace prefix, also the need to differentiate between name() and local-name() in xpath or nodeName and baseName in DOM.
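A small Python sketch of the distinction mentioned here. `xml.etree.ElementTree` stores names as `{uri}local` and discards the prefix, so what can be recovered is the namespace URI and the local name (roughly XPath's namespace-uri() and local-name()); the prefixed form that name() or nodeName would return is not available from the tag alone. The helper name `split_name` is hypothetical.

```python
def split_name(tag):
    """Split an expanded '{uri}local' tag into (uri, local); uri is None if absent."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


print(split_name("{http://www.w3.org/1999/xhtml}p"))
print(split_name("p"))
```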

 [28/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 9:25


On Fri, 8 Aug 2003 15:55:02 +0200, "bryan" <[bry--itnisk--com]> said:
> >I haven't found that having a different syntax for addressing
> attributes
<<quoted lines omitted: 8>>
> materials, TEI, Docbook, xhtml etc, as such structures are less regular
> they increase paranoia.
Ahh...yes, so indeed you've fallen outside the goal of xml-object which is not intended to be used for document-oriented XML. I've just started working with DocBook myself lately.
> [...]
A real-live XPath implementation in REBOL would be useful, though *waayyy* non-trivial to build. Or, at least, building a compliant XPath implementation is non-trivial. A few years ago I contracted out to a company (Ginger Alliance in the Czech Republic (though I'm in Canada), see www.gingerall.com...great bunch of folks to work with) to build an XPath processor. They had lots of experience with XPath and XSLT and still it was a multi-week full-time-developer effort to get the XPath functionality working properly. Gavin.

 [29/31] from: rebol:gavinmckenzie:fastmail:fm at: 8-Aug-2003 9:28


> [...]
> >it would be bad to write a path like xml/a:a/b:b
<<quoted lines omitted: 3>>
> also the need to differentiate between name() and local-name() in xpath
> or nodeName and baseName in DOM.

Ahh, you've clearly demonstrated that you know your XML.

Gavin.

 [30/31] from: bry:itnisk at: 11-Aug-2003 11:56


>A real-live XPath implementation in REBOL would be useful, though
>*waayyy* non-trivial to build.

I think the difficulty, however, can be lessened by how the object is structured. I've been thinking that it needs one more level of abstraction; that is to say, each step in the object tells you what elements, attributes, namespaces, element name, defaultNamespace, and textnodes are contained at that step, with textnodes referenced inside the elements block so that one can maintain their position.

Child1: make object! [
    Name: "h:p"
    Elements: [t1 "h:p" "a" t2 "n:p"]
    Attributes: ["class" "style"]
    Namespaces: ["xmlns:h='http://www.w3.org/1999/xhtml'"]
    defaultNamespace: "http://www.w3.org/1999/xhtml"
    att1: "ParaStyle1"
    att2: "position:relative;"
    t1: "here's some text"
    t2: "hi"
    Child1: make object! [
        Name: "h:p"
        .... and so on and so on
    ]
]

It is definitely not as elegant as just using a Rebol path to navigate to /p/p, but I think the abstraction offered here is necessary in order to handle all the possible weird textnode and namespacing issues.
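One possible rendering of this per-step structure, sketched in Python rather than Rebol. It departs from the layout in the message in one respect, which is an assumption on my part: text nodes are stored directly in the ordered `children` list instead of being referenced by names such as t1 and t2. The class and field names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Node:
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    namespaces: Dict[str, str] = field(default_factory=dict)
    default_namespace: str = ""
    # Text strings and child nodes share one ordered list, so position is kept.
    children: List[Union[str, "Node"]] = field(default_factory=list)

    def text_nodes(self):
        """Return the text children in document order."""
        return [c for c in self.children if isinstance(c, str)]

    def elements(self, name=None):
        """Return the element children, optionally filtered by name."""
        return [
            c for c in self.children
            if isinstance(c, Node) and (name is None or c.name == name)
        ]


para = Node(
    name="h:p",
    attributes={"class": "ParaStyle1", "style": "position:relative;"},
    namespaces={"h": "http://www.w3.org/1999/xhtml"},
    default_namespace="http://www.w3.org/1999/xhtml",
    children=["here's some text", Node("h:p"), Node("a"), "hi", Node("n:p")],
)

print(para.text_nodes())
print([e.name for e in para.elements()])
```

Keeping attributes, namespaces, and ordered children as separate fields is what lets an XPath-style query distinguish @class from a child element called class.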

 [31/31] from: bry:itnisk at: 18-Aug-2003 13:26


>>A real-live XPath implementation in REBOL would be useful, though
>>*waayyy* non-trivial to build.
<<quoted lines omitted: 4>>
>textnodes are contained at that step. With textnodes referenced inside
>the elements block so that one can maintain their position.
Part of the reason for my thinking the REBOL XML engine should work as stated above is that this is similar to how xmerl handles it http://www.creado.com/xmerl_xs/userguide.html

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted