
[REBOL] Re: Navigation in output produced by parse-xml

From: joel:neely:fedex at: 25-May-2001 12:38

Hi, James!  Welcome!

James Carlyle wrote:

> I want to parse an XML document, find an element, and then
> find some ancestors of that element.  Using parse-xml, I can
> get a structure of blocks.  Using the Ryan's 'deep-find'
> example in other messages, I can traverse the block structure
> looking for a particular block.  But how can I identify the
> parent of a block, for example?
>

See below for one way.

> Do blocks have the concept of a parent when nested in other
> blocks, or should I store the ancestry of blocks as I recurse
> into a block structure using deep-find?
>

I wouldn't use DEEP-FIND, but rather a function that "knows" about
the structure of blocks coming from PARSE-XML.  Here's one way to
do it, in a fairly generic fashion:

8<------------------------------------------------------------
walkxml: func [
    xmlb [block!] sel [any-function!] doer [any-function!]
    /local _walk parents
][
    parents: copy []    ; stack of element blocks, current element first
    _walk: func [xel [block!]] [
        insert/only parents xel    ; /only pushes the block itself, not its contents
        if do [sel parents] [do [doer parents]]
        if found? third xel [
            foreach item third xel [
                if block? item [_walk item]    ; descend into child elements only
            ]
        ]
        remove parents    ; pop the current element
    ]
    _walk first third xmlb    ; start at the root element
]
8<------------------------------------------------------------

The above function takes three arguments: a block created by
PARSE-XML, a selection function that looks at a stack of element
blocks (current, parent, grandparent, etc.), and a doer function
that takes some action based on the same sequence.

Remember that the first item in an XML-block is the tag name, the
second is the attribute list (name/value pairs), and the third is
a block containing all of the element's content.

Suppose I have an XML document that looks like this:

8<------------------------------------------------------------
<sample>
  <table>
    <row>
      <cell>A</cell>
      <cell>B</cell>
      <cell>C</cell>
    </row>
    <row>
      <cell>D</cell>
      <cell>E</cell>
      <cell>F</cell>
    </row>
  </table>
  <table>
    <column>
      <cell>A</cell>
      <cell>D</cell>
    </column>
    <column>
      <cell>B</cell>
      <cell>E</cell>
    </column>
    <column>
      <cell>C</cell>
      <cell>F</cell>
    </column>
  </table>
</sample>
8<------------------------------------------------------------

We can demo WALKXML by just showing the order in which it visits
the components of that structure, after saying

    data: parse-xml read %sample.xml

A simple selector that accepts anything, and a doer that just shows
the tag name, look and work like this:

    any?: func [estack [block!]] [true]

    print-first: func [estack [block!]] [
        print first first estack
    ]

>> walkxml data :any? :print-first
sample
table
row
cell
cell
cell
row
cell
cell
cell
table
column
cell
cell
column
cell
cell
column
cell
cell

(If you think there's an extra FIRST in PRINT-FIRST, remember that
ESTACK contains all the parents from the current element upward.)

A doer that indents each tag name:

    indent-name: func [estack [block!]] [
        loop length? estack [prin " "]
        print first first estack
    ]

>> walkxml data :any? :indent-name
 sample
  table
   row
    cell
    cell
    cell
   row
    cell
    cell
    cell
  table
   column
    cell
    cell
   column
    cell
    cell
   column
    cell
    cell

A doer that shows the full "path" for each element looks like this:

    print-path: func [estack [block!] /local str] [
        str: copy ""
        foreach item estack [
            insert str rejoin ["/" first item]
        ]
        print str
    ]

>> walkxml data :any? :print-path
/sample
/sample/table
/sample/table/row
/sample/table/row/cell
/sample/table/row/cell
/sample/table/row/cell
/sample/table/row
/sample/table/row/cell
/sample/table/row/cell
/sample/table/row/cell
/sample/table
/sample/table/column
/sample/table/column/cell
/sample/table/column/cell
/sample/table/column
/sample/table/column/cell
/sample/table/column/cell
/sample/table/column
/sample/table/column/cell
/sample/table/column/cell

Now let's tackle something that requires more selection criteria.
To show only the content of the CELL tags:

    cell?: func [estack [block!]] [
        "cell" = first first estack
    ]

    print-content: func [estack [block!]] [
        print third first estack
    ]

>> walkxml data :cell? :print-content
A
B
C
D
E
F
A
D
B
E
C
F

But now, suppose I only want to print the content of CELL tags that
are inside ROW tags:

    cell-in-row?: func [estack [block!]] [
        all [
            2 <= length? estack
            "cell" = first first estack
            "row" = first second estack
        ]
    ]

>> walkxml data :cell-in-row? :print-content
A
B
C
D
E
F

With the generic navigation provided by WALKXML, you can apply any
test to the current block and its ancestry, and you can perform any
operation on the current block and its ancestry.

Hope this helps!

-jn-

------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com
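
As a further illustration of performing "any operation" on the
current block, a doer does not have to print anything.  What follows
is only a sketch: COLLECT-CONTENT is a made-up name, and it assumes
that WALKXML stacks each element block as a single item (as
INSERT/ONLY would), since the selectors above index into ESTACK
that way.

8<------------------------------------------------------------
collect-content: func [
    xmlb [block!] sel [any-function!]
    /local result
][
    result: copy []
    ; the doer appends the selected element's content to RESULT
    walkxml xmlb :sel func [estack [block!]] [
        if found? third first estack [    ; skip elements with no content
            append result third first estack
        ]
    ]
    result
]
8<------------------------------------------------------------

With the sample document, something like

    probe collect-content data :cell-in-row?

should then show a block holding the content of the six cells that
sit inside ROW tags, in document order.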