Mailing List Archive: Re: Parse This

[REBOL] Re: Parse This

From: brett:codeconscious at: 12-Feb-2002 17:12


Hi,

> Please note the curly-thingy before 37822. (sorry, don't know what it's
> called in English)

Curly brace, or curly bracket. Curly thingy works too. :)

> Jeff says here that it should be trivial to parse the string. It's not
> trivial for me! I have no clue whatsoever as to how to parse the string,
> so I can sort it by the subject.

Looking at your output, and mine, the pattern seems to be that each message
is on a separate line. The first thing on the line is the article number,
followed by a space.

The line termination appears to be a simple NEWLINE.

So here's my strategy: copy everything up to the space (that's the article
number), skip the space, copy everything up to the end-of-line marker, then
skip the end-of-line marker. Repeat for as many lines as necessary. I'm
assuming there is no end-of-line marker at the very start and that there is
one at the very end. If this is wrong, you will need to change the rules. I
need to use the /all refinement on parse because otherwise, by default, parse
will skip whitespace.

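    parse/all first xresult [
        (messages: copy [])            ; start with an empty result block
        any [
            copy txt to #" "           ; article number, up to the space
            skip                       ; step over the space
            (insert tail messages txt)
            copy txt to newline        ; rest of the line
            skip                       ; step over the newline
            (insert tail messages txt)
        ]
    ]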

This results in a flat block of strings: article number, message id, article
number, message id, and so on.
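
If you want to try the rules without a news server, here is a small sketch
you can paste into the console. The sample string and the words sample, num
and id are made up for illustration, and I am assuming REBOL 2 (where parse
has the /all refinement). Unlike the rules above, it only adds to the block
once both parts of a line have matched:

    ; made-up sample data: two lines, each ending in a newline (^/)
    sample: "37822 <abc@example.com>^/37823 <def@example.com>^/"
    messages: copy []
    parse/all sample [
        any [
            copy num to #" " skip      ; article number, then step over the space
            copy id to newline skip    ; rest of the line, then step over the newline
            (append messages num  append messages id)
        ]
    ]
    probe messages
    ; prints ["37822" "<abc@example.com>" "37823" "<def@example.com>"]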

To get all the article numbers:

    extract messages 2

To get all the message ids:

    extract next messages 2
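
As a quick illustration of what those two calls return, here is a sketch on
a hand-written block (the values are made up, not real articles). The
foreach at the end is just one possible way to walk the block a pair at a
time:

    messages: ["37822" "<abc@example.com>" "37823" "<def@example.com>"]

    probe extract messages 2        ; every second item, starting at the first
    ; prints ["37822" "37823"]

    probe extract next messages 2   ; every second item, starting at the second
    ; prints ["<abc@example.com>" "<def@example.com>"]

    ; take two items per pass: article number and message id
    foreach [num id] messages [
        print [num "->" id]
    ]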

> I have read the entire chapter on parsing in the core-manual several
> times without getting any wiser.

I've made a parse tutorial which I hope complements the official doco - you
can find it at:

http://www.codeconscious.com/rebol under articles.

Regards,
Brett.