Mailing List Archive: Re: Complex Series Parsing (Part 2)

[REBOL] Re: Complex Series Parsing (Part 2)

From: petr:krenzelok:trz:cz at: 9-Mar-2001 23:57


----- Original Message -----
From: <[sterling--rebol--com]>
To: <[rebol-list--rebol--com]>
Sent: Friday, March 09, 2001 8:03 PM
Subject: [REBOL] Re: Complex Series Parsing (Part 2)

> Well, before anybody goes further into the "here's something that
> works for the last input you posted" followed by "but then there's
> this input that doesn't work" path, let's go back to the definition of
> input and output.
>
> If you use load/markup and treat the REBOL words you have in your block
> as strings like Andrew suggests (which is a better way to deal with
> them), then you have these input elements:
> * <text>  -- open text tag
> * "???"   -- some arbitrary string
> * <???>   -- some other open tag
> * </???>  -- some close tag
> * </text> -- a close text tag
>
> Your input looks like this:
> probe input: load/markup {<tag0></tag0> <text> this and that
> <tag1>those </tag1>and > these</text><tag2></tag2><text>There and
> then</text>}
> == [<tag0> </tag0> " " <text> " this and that^/" <tag1> "those "
> </tag1> "and > these" </text> <tag2> </tag2> <text> "There and^/then"
> </text>]
>
> If you want, you can get rid of the whitespace-only strings that are
> created by whitespace between the tags.

OK, is there an easy way to do that without using iteration? If I use
e.g. replace/all blk " " none, it will just replace each whitespace string
with 'none, but we simply want to remove the whitespace :-)
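
One candidate might be remove-each, assuming a REBOL release that provides
it (not every release may). It still walks the block internally, so it only
hides the iteration rather than avoiding it. A minimal sketch; the sample
block is made up for illustration and shortened from the input quoted above:

```rebol
; Assumption: remove-each is available in this REBOL release.
; Hypothetical sample data, shortened from the quoted input.
blk: load/markup {<tag0></tag0> <text> this and that</text>}

; Drop every string that is empty once its whitespace is trimmed.
; trim modifies its argument, so test a copy and leave kept strings intact.
remove-each item blk [
    all [string? item  empty? trim copy item]
]

probe blk
```

This should leave the tags and the " this and that" string in place and drop
only the " " that came from the space between </tag0> and <text>.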

> Now write the spec:
> * any combination of input elements up to <text>
> * open <text>
> * any combination of "???", <???>, </???> where <text> would be
> inserted in front of each "???"
> * </text>
> * start the whole process over
> Done.
>
> That's all you've told us so far.  Each item above is essentially a
> parse rule already.  Some can be joined together:
> * [thru <text>]
> * [any [
>       </text> [thru <text>]
>     | tag!
>     | string! mark: (insert back mark <text>) string!
>   ]
>   ]
>
> Now we just assemble:
> ; skip the immediate string after <text> so we don't add a second one
> start-rule: [thru <text> [string! | none]]
> parse input [
>     start-rule
>     any [
>         </text> start-rule ; start over
>       | tag! ; eat any random tags
>       | string! mark: (insert back mark <text>) string!
>     ]
> ]
>

So you prefer to work with XML-like data in block mode rather than in string
mode?

Cheers,
-pekr-