Complex Series XML Parsing?

 [1/2] from: depotcity::home::com at: 7-Mar-2001 14:48


Hello all.

What's the best approach to parsing out the data between XML tags in a series? E.g.:

<thexml>
<tag1>
This is some string <tag2> with embedded tags </tag2> and another <tag3> with this string, </tag3> ending with this string.</tag1>
</thexml>

so that we get...

thexml: [
    tag1 "This is some string"
    tag2 "with embedded tags"
    tag1 "and another"
    tag3 "with this string"
    tag1 "ending with this string."
]

Where: the tag is converted to a word and the contents into a string, BUT... the tag names are unknown in advance, only that they are tags.

Thanks
Terry Brownell
www.lfred.com

 [2/2] from: christian:ensel:gmx at: 8-Mar-2001 16:02


Hi Terry Brownell,

On Wednesday, 07-Mar-01, 23:48:09, Terry Brownell wrote on subject "[REBOL] Complex Series XML Parsing?":
> so that we get...
> thexml: [
<<quoted lines omitted: 4>>
> tag1 "ending with this string."
> ]
The format you've chosen looks a little unusual to me, because I don't see why you drop the nesting information packed into the XML source.

However, the following quick and dirty code may at least be a starting point, assuming that you only have to deal with valid XML (no prologue, no DTD etc.) and aren't interested in comments or processing instructions.

Out of your example it produces:

document: [
    thexml [
        tag1 [
            "This is some string"
            tag2 ["with embedded tags"]
            "and another"
            tag3 ["with this string"]
            "ending with this string."
        ]
    ]
]

Regards,
Christian

--------------------------------------------------------------------------
REBOL []

xml: {
<thexml>
<tag1>This is some string <tag2>with embedded tags</tag2><empty/> and another
<tag3>with this string,</tag3>ending with this string
</tag1>
</thexml>
}

; DOCUMENT is the result; CONTAINER is the block currently being filled;
; CONTAINERS is the stack of open blocks, innermost last.
container: document: []
append/only containers: [] container

content-rule: [
    any [
        ; text: collapse whitespace, keep only non-empty strings
        copy a string! (a: trim/lines first a if not empty? a [append container a])
        | copy a tag! (a: first a handle-tag a)
    ]
]

no-space: complement charset " ^M^-^/"

handle-tag: func [tag [tag!] /local name] [
    ; the tag name is everything up to the first whitespace
    parse/all tag [copy name any no-space to end]
    if equal? last tag #"/" [ ;empty-tag
        if equal? last name #"/" [remove back tail name]
        append container to-word name
        exit
    ]
    if equal? first name #"/" [ ;closing tag: pop back to the parent block
        container: last containers: head remove back tail containers
        exit
    ]
    if equal? first name #"?" [exit] ;drop pi's
    if equal? first name #"!" [exit] ;drop comments
    ;opening-tags: add the word plus a fresh block and descend into it
    repend container [to-word name container: copy []]
    append/only containers container
]

xml: load/markup xml
parse xml content-rule
print mold document
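
If the flat layout from the original question is still wanted, the nested block could be flattened afterwards. The following is only a minimal sketch, assuming the nested format described above (a tag word, optionally followed by a block holding its content). FLATTEN and its OWNER argument are hypothetical names and are not part of the posted script.

```rebol
; Hypothetical helper: turn the nested DOCUMENT block into
; alternating tag-word / string pairs.
flatten: func [
    content [block!]       ; nested content of one element
    owner [word! none!]    ; tag word that owns CONTENT, none at top level
    /local out item
][
    out: copy []
    while [not tail? content] [
        item: first content
        content: next content
        either string? item [
            ; text belongs to the enclosing tag
            if owner [repend out [owner item]]
        ][
            ; ITEM is a tag word; a following block holds its content,
            ; an empty tag has no block and contributes nothing
            if all [not tail? content block? first content] [
                append out flatten first content item
                content: next content
            ]
        ]
    ]
    out
]

print mold flatten document none
```

Each string is paired with the word of the tag that directly encloses it, so text that continues after a nested tag is attributed to the outer tag again, as in the requested layout. Empty tags are dropped in this sketch because they have no string to pair with.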

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted