Parse This
[1/3] from: hanserik::oakseeker::org at: 12-Feb-2002 5:56
Hiyas all:-)))
I have just downloaded REBOL/View and have started playing with nntp.r, and I have stumbled
into some problems.
I have no problems connecting to the news server and getting the subject lines from the
newsgroup I'm interested in. But here comes the problem: I receive them in one huge
block:
>> xresult: insert np [xhdr ["subject " count/2 "-" count/3] from "dk.historie.genealogi"]
== [{37822 Re: Windows95 reinstallation
37823 =?iso-8859-1?Q?S=F8ger?= Volhaus.
37824 Re: DIS-Danmarks formand tavs...
37825 Re: DI...
Please note the curly-thingy before 37822. (Sorry, I don't know what it's called in English.)
Searching the REBOL list archive for xhdr, I came across this from Jeff:
---quote
np: open news://news.somewhere
set [total start end] insert np [count from "alt.test"]
x-mids: rejoin ["Message-ID " start "-" end]
message-ids: insert np [xhdr x-mids from "alt.test" please]
;- please is optional :)
The XHDR command gives you back a big string in a block.
Yes, that is a little odd (XHDR was added at the last
minute just to help aspiring news bot writers, if you want
to know!). The string you will find in the block has the
number of each article followed by the message id. It's
trivial to parse the string and it'll allow you to ask for
individual articles by their message-ids in order. There are
examples of getting articles by message id in the NNTP.r
howto. Using XHDR, you'll have an efficient way of
obtaining true newsgroup ordering with no gaps (for news
servers that support the feature ... If they don't, well, you
probably have to fall back on getting all the message
headers in a group if you want to ensure total
ordering... that's what Forte' Free Agent does!!).
---unquote
Jeff says here that it should be trivial to parse the string. It's not trivial for me!
I have no clue whatsoever how to parse the string so that I can sort it by subject.
I have read the entire chapter on parsing in the Core manual several times without getting
any wiser.
Is there a kind soul out there who can help me out, please?
Hans-Erik
[2/3] from: brett:codeconscious at: 12-Feb-2002 17:12
Hi,
> Please note the curly-thingy before 37822. (Sorry, I don't know what it's
> called in English.)
Curly brace, or curly bracket. Curly thingy works... :)
> Jeff says here that it should be trivial to parse the string. It's not
> trivial for me! I have no clue whatsoever how to parse the string so that I
> can sort it by subject.
Looking at your output, and mine, the pattern seems to be that each message
is on a separate line. The first thing on the line is the article number,
followed by a space.
The line termination appears to be a simple NEWLINE.
So here's my strategy: copy everything up to the space (that's the article
number), skip the space, copy everything up to the end-of-line marker, then
skip the end-of-line marker. Repeat for as many lines as necessary. I'm assuming
there is no end-of-line marker at the very start and that there is one at the
very end. If this is wrong, you will need to change the rules. I need to use
the /all refinement on parse; otherwise, by default, parse will skip whitespace.
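    parse/all first xresult [
        (messages: copy [])            ; start with an empty result block
        any [
            copy txt to #" "           ; article number, up to the first space
            skip                       ; step over the space
            (insert tail messages txt)
            copy txt to newline        ; rest of the line
            skip                       ; step over the newline
            (insert tail messages txt)
        ]
    ]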
This results in a block of strings: article number, then header value (the
message id in Jeff's example, the subject in yours), article number, header
value, etc...
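The rule above relies on every line, including the last one, ending in a newline. A minimal sketch of a variant that also accepts a final line without a trailing newline follows. The sample string and the names raw, num and subj are made up for illustration; raw stands in for first xresult.

```rebol
REBOL []

; Hypothetical sample data standing in for `first xresult`.
; Note that the last line has no trailing newline.
raw: "101 Re: first subject^/102 Second subject^/103 Re: third"

messages: copy []
parse/all raw [
    any [
        copy num to #" " skip              ; article number, then step over the space
        copy subj [to newline | to end]    ; rest of the line, or rest of the input
        (append messages num  append messages subj)
        opt newline                        ; consume the line break if there is one
    ]
]
probe messages
```

A line whose text after the space is empty is not handled specially here, so real data may need an extra check before the values are appended.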
To get all the article numbers
extract messages 2
To get all the header values (message ids or, in your case, subjects)
extract next messages 2
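Since the original question was how to sort by subject, here is one possible sketch, assuming the flat number/value layout described above. It is not taken from any message in this thread: it copies the pairs into a new block with the subject first and then uses sort/skip, which orders fixed-size records by their first field. The block contents and the name by-subject are hypothetical.

```rebol
REBOL []

; Hypothetical flat block in the layout described above:
; article number, subject, article number, subject, ...
messages: [
    "201" "Re: Parish registers"
    "202" "Census records"
    "203" "Archive opening hours"
]

; Rebuild the pairs with the subject first, so that it becomes the sort key.
by-subject: copy []
foreach [num subj] messages [
    append by-subject subj
    append by-subject num
]

; Treat the block as records of two values and sort on the first one.
sort/skip by-subject 2

foreach [subj num] by-subject [print [num subj]]
```

Replies that start with "Re: " will sort together under R; stripping that prefix before building by-subject would group them with the original subject instead.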
> I have read the entire chapter on parsing in the Core manual several
> times without getting any wiser.
I've made a parse tutorial which I hope complements the official docs. You
can find it at:
http://www.codeconscious.com/rebol under articles.
Regards,
Brett.
[3/3] from: joel:neely:fedex at: 12-Feb-2002 7:19
Hi, Hans-Erik,
Hans-Erik wrote:
> I have no problems connecting to the news server and getting
> the subject lines from the newsgroup I'm interested in.
<<quoted lines omitted: 4>>
> 37824 Re: DIS-Danmarks formand tavs...
> 37825 Re: DI...
...
> Jeff says here that it should be trivial to parse the string.
>
Hope this helps!
>> blort: {this is a very long string
{ broken into several lines
{ delimited by only a newline
{ which can be parsed easily
{ with the following trick:}
== {this is a very long string
broken into several lines
delimited by only a newline
which can be parsed easily
with the following ...
>> parse/all blort "^/"
== ["this is a very long string" "broken into several lines"
"delimited by only a newline" "which can be parsed easily"
"with the f...
>> nbr: 0 foreach element parse/all blort "^/" [
[ nbr: nbr + 1
[ print [nbr element]
[ ]
1 this is a very long string
2 broken into several lines
3 delimited by only a newline
4 which can be parsed easily
5 with the following trick:
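To go from the split lines to separate article numbers and subjects, each line can then be cut at its first space. The following is only a sketch under assumptions: the sample string and the names raw, pairs, pos, num and subj are invented for illustration, and find with copy/part is one possible way to do the cut, not something shown elsewhere in this thread.

```rebol
REBOL []

; Hypothetical sample data in the "number, space, subject" line format.
raw: "301 Re: first subject^/302 Second subject"

pairs: copy []
foreach line parse/all raw "^/" [
    ; Look for the first space; lines without one are skipped.
    if pos: find line " " [
        num: copy/part line pos     ; everything before the space
        subj: copy next pos         ; everything after the space
        append pairs num
        append pairs subj
        print [num "|" subj]
    ]
]
```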
-jn-
--
; sub REBOL {}; sub head ($) {@_[0]}
REBOL []
# despam: func [e] [replace replace/all e ":" "." "#" "@"]
; sub despam {my ($e) = @_; $e =~ tr/:#/.@/; return "\n$e"}
print head reverse despam "moc:xedef#yleen:leoj" ;
Notes
- Quoted lines have been omitted from some messages.