Parse This

[1/3] from: hanserik::oakseeker::org at: 12-Feb-2002 5:56

Hiyas all:-))) I have just downloaded REBOL/View and have started playing with nntp.r, and I have stumbled into some problems. I have no problems connecting to the news server and getting the subject lines from the newsgroup I'm interested in. But here comes the problem. I receive them in one huge block:

>> xresult: insert np [xhdr ["subject " count/2 "-" count/3] from "dk.historie.genealogi"]

== [{37822 Re: Windows95 reinstallation
37823 =?iso-8859-1?Q?S=F8ger?= Volhaus.
37824 Re: DIS-Danmarks formand tavs...
37825 Re: DI...

Please note the curly-thingy before 37822. (Sorry, I don't know what it's called in English.)

Doing a search in the REBOL list archive on xhdr, I came across this from Jeff:

---quote

    np: open news://news.somewhere
    set [total start end] insert np [count from "alt.test"]
    x-mids: rejoin ["Message-ID " start "-" end]
    message-ids: insert np [xhdr x-mids from "alt.test" please] ;- please is optional :)

The XHDR command gives you back a big string in a block. Yes, that is a little odd (XHDR was added at the last minute just to help aspiring news bot writers, if you want to know!). The string you will find in the block has the number of each article followed by the message id. It's trivial to parse the string and it'll allow you to ask for individual articles by their message-ids in order. There's examples of getting articles by message id in the NNTP.r howto.

Using XHDR, you'll have an efficient way of obtaining true newsgroup ordering with no gaps (for news servers that support the feature ... If they don't, well, you probably have to fall back on getting all the message headers in a group if you want to ensure total ordering... that's what Forte' free agent does!!).

---unquote

Jeff says here that it should be trivial to parse the string. It's not trivial for me! I have no clue whatsoever how to parse the string so that I can sort it by the subject. I have read the entire chapter on parsing in the core manual several times without getting any wiser.

Is there a kind soul out there who can help me out, please?

Hans-Erik

[2/3] from: brett:codeconscious at: 12-Feb-2002 17:12

Hi,

> Please note the curly-thingy before 37822. (sorry, dont know what it's called in english)

Curly brace, or curly bracket. Curly thingy works.. :)

> Jeff says here that it should be trivial to parse the string. It's not trivial for me! I have no clue what so ever as how to parse the string, so i can sort it by the subject.

Looking at your output, and mine, the pattern seems to be that each message is on a separate line. The first thing on the line is the article number followed by a space. The line termination appears to be a simple NEWLINE.

So here's my strategy: copy everything up to the space (that's the article number), skip the space, copy everything up to the end-of-line marker, then skip the end-of-line marker. Repeat for as many lines as necessary.

I'm assuming there is not an end-of-line marker at the very start and that there is one at the very end. If this is wrong, you will need to change the rules.

I need to use the /all refinement on parse because otherwise, by default, parse will skip whitespace.

    parse/all first xresult [
        (messages: copy [])
        any [
            copy txt to #" " skip (insert tail messages txt)
            copy txt to newline skip (insert tail messages txt)
        ]
    ]

This results in a block of strings: article number, then message id, article number, message id, etc...

To get all the article numbers:

    extract messages 2

To get all the message ids:

    extract next messages 2

> I have read the entire chapter on parsing in the core-manual serveral times without getting any wiser.

I've made a parse tutorial which I hope complements the official doco. You can find it at http://www.codeconscious.com/rebol under articles.

Regards,
Brett.
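
The sorting step from the original question can be sketched on top of the parse rule in this message. This is only a minimal sketch, assuming REBOL 2, where `parse/all` exists and `sort/compare` accepts an integer naming the field within each record to compare on. The `sample` string and the words `num` and `subj` are hypothetical stand-ins; in the real session the input would be `first xresult`.

```rebol
REBOL []

; Hypothetical sample data standing in for `first xresult`:
; one "number subject" pair per line, each line ending in a newline.
sample: "37823 Zebra topic^/37822 Re: Windows95 reinstallation^/37824 Apple topic^/"

; Same strategy as the rule above: number up to the first space,
; then the rest of the line up to the newline.
messages: copy []
parse/all sample [
    any [
        copy num to #" " skip (append messages num)
        copy subj to newline skip (append messages subj)
    ]
]

; Treat the block as records of two items (number, subject)
; and order the records by the second item, the subject.
sort/skip/compare messages 2 2

; Walk the sorted block two items at a time.
foreach [num subj] messages [print [num subj]]
```

Keeping number and subject side by side in one flat block means the article number stays attached to its subject after sorting, so the numbers can still be used to request the articles.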

[3/3] from: joel:neely:fedex at: 12-Feb-2002 7:19

Hi, Hans-Erik,

Hans-Erik wrote:

> I have no problems connection to the news-server and getting
> the subject-lines from the news-group i'm interrested in.

<<quoted lines omitted: 4>>

> 37824 Re: DIS-Danmarks formand tavs...
> 37825 Re: DI...

...

> Jeff says here that it should be trivial to parse the string.

Hope this helps!

>> blort: {this is a very long string

{    broken into several lines
{    delimited by only a newline
{    which can be parsed easily
{    with the following trick:}
== {this is a very long string
broken into several lines
delimited by only a newline
which can be parsed easily
with the following ...

>> parse/all blort "^/"

== ["this is a very long string" "broken into several lines" "delimited by only a newline" "which can be parsed easily" "with the f...

>> nbr: 0 foreach element parse/all blort "^/" [

[    nbr: nbr + 1
[    print [nbr element]
[    ]
1 this is a very long string
2 broken into several lines
3 delimited by only a newline
4 which can be parsed easily
5 with the following trick:

-jn-

--
; sub REBOL {}; sub head ($) {@_[0]}
REBOL []
# despam: func [e] [replace replace/all e ":" "." "#" "@"]
; sub despam {my ($e) = @_; $e =~ tr/:#/.@/; return "\n$e"}
print head reverse despam "moc:xedef#yleen:leoj" ;
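
The newline-splitting trick above can be carried one step further to separate each line into article number and subject. This is a rough sketch under the same REBOL 2 assumption (`parse/all` with a string of delimiters); the `raw` sample and the words `pos`, `num` and `subj` are hypothetical, and in the real session `raw` would be `first xresult`.

```rebol
REBOL []

; Hypothetical sample standing in for `first xresult`.
raw: "37822 Re: Windows95 reinstallation^/37823 Volhaus^/"

; Split the big string into lines, then cut each line at its first space.
foreach line parse/all raw "^/" [
    ; FIND returns the position of the first space, or NONE if the
    ; line has no space (for example an empty line), which IF skips.
    if pos: find line " " [
        num: copy/part line pos     ; everything before the space
        subj: copy next pos         ; everything after the space
        print [num "|" subj]
    ]
]
```

Only the first space is used as the cut point, so subjects that contain spaces stay in one piece.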

Notes

Quoted lines have been omitted from some messages.