[REBOL] Problem with newsgroup message count Re:(2)
From: jeff:rebol at: 1-Sep-2000 17:07
> > When requesting number of messages for newsgroup
> > fr.misc.finance we got 1000000 ! although there are less
> > than 3000 messages. How can we cope with that ?
>
> The built-in nntp protocol (nntp://) appears to always
> return 1000000 as the length of the group.
>
> The external news:// protocol ('do %nntp.r) has a count
> command which looks like it correctly obtains article
> counters from the news server. However, I can't figure out
> the proper way to issue the count command via news://
> ... I'm copying this to Jeff directly (as the all-knowing
> master of the news:// protocol) to see if he can explain
> it.
>
> Cheers, Kev
yo:
The problem is that news servers lie. The built-in nntp
handler is supposed to provide a series metaphor for a
newsgroup, but we can't be assured of a total ordering of
the messages there (many may be missing from the total that
is returned). That is how the NNTP spec works, since
articles may be expiring at the moment you inquire. The
count data returned from NNTP servers isn't supposed to be
reliable, so the series metaphor is kind of weak in this
case (especially compared to pop://, for instance). 1000000
was just an arbitrary large number for the length of the
port, since the real length is indeterminable. Do you think
it would be better to have whatever the server said about
the message count there?
Anyhow, here's a post I made a while back that talks about
this issue and what someone might do to guarantee message
ordering, at least with nntp.r:
--------------------------------------------
NNTP servers lie. They tell you a range of numbers of
articles that MAY be there. They may well not be there,
though. It's a funny thing.
The only way to really get the true number of articles in a
newsgroup is to get the headers for each article (which over
a 22kb modem ain't always a good idea).
NNTP.r, as it is, also provides the optional NNTP "XHDR"
command which lets you quickly download just the subject
lines (or any other given header field: from, to, keywords,
etc.) of all the headers in the group. Having all the
subject lines, you can then know for sure (unless some of
those articles expire while you are reading) the count of
articles in a newsgroup.
One of the things NNTP does when it connects to a news
server is determine if it can do XHDR. Interactively you
can ask an open news port what it can do by inserting [help]
into the port. Here's how you can determine
non-interactively if the server you are talking to has XHDR:
    np: open news://news.somewhere
    found? find np/handler/commands 'xhdr
Using XHDR you can do something like the following:
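    np: open news://news.somewhere
    ; count gives back the server-reported total and article range
    set [total start end] insert np [count from "alt.test"]
    ; XHDR argument: header name followed by the article range
    x-mids: rejoin ["Message-ID " start "-" end]
    message-ids: insert np [xhdr x-mids from "alt.test" please]
    ;- please is optional :)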
The XHDR command gives you back a big string in a block.
Yes, that is a little odd (XHDR was added at the last
minute just to help aspiring news bot writers, if you want
to know!). The string you will find in the block has the
number of each article followed by the message id. It's
trivial to parse the string and it'll allow you to ask for
individual articles by their message-ids in order. There are
examples of getting articles by message id in the NNTP.r
howto. Using XHDR, you'll have an efficient way of
obtaining true newsgroup ordering with no gaps (for news
servers that support the feature... if they don't, well, you
probably have to fall back on getting all the message
headers in a group if you want to ensure total
ordering... that's what Forte Free Agent does!!).
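Here is a rough sketch of parsing the XHDR string. It
assumes the string holds one "article-number message-id"
pair per line, separated by spaces and ending in a plain
newline. Check what your server actually sends, since I
have not verified the exact layout. The sample data and the
words sample, digits and id-map are made up for
illustration and are not part of NNTP.r:

```rebol
REBOL [Title: "Parse XHDR output (sketch)"]

; Hypothetical stand-in for the string found in the block
; that the xhdr command returns.
sample: {101 <a1@example.com>
102 <b2@example.com>
105 <c3@example.com>
}

digits: charset "0123456789"
id-map: copy []    ; flat block: number, message-id, number, ...

parse/all sample [
    any [
        copy num some digits         ; article number
        some " "
        copy mid [to newline | to end]   ; message id
        opt newline
        (repend id-map [to-integer num mid])
    ]
]

; Each article contributes two values to id-map, so half
; its length is the number of articles actually listed.
print ["articles listed:" (length? id-map) / 2]
foreach [num mid] id-map [print [num mid]]
```

The count you get this way comes from the articles the
server actually listed, not from the range it advertised,
so gaps in the numbering (102 to 105 above) don't inflate
it.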
Boy, nntp.r really is in need of an update. Looking at
NNTP.r it's doing all sorts of dialecting things by hand
that would be a lot easier to do today in modern REBOL with
things like parse block. The whole thing could be shrunk
by at least half. So many things rattling around on our
overly loaded wagon trains...
[still need to get to that... arg.. too much stuff.. head
caving in... must.. alert the others... desk jobs sometimes
aren't "cushy".. but can be fun anyhow]