[REBOL] Re: Rebol web presence statistics
From: carl:cybercraft at: 19-Mar-2004 20:47
On 19-Mar-04, Hallvard Ystad wrote:
> Hi list
> A quick tour around the search engines reveals they find this many
> documents about "rebol":
> Google: "about 188,000"
> Altavista: 16,721
> Alltheweb: 51,983
> Hotbot: 14,131
> Teoma: "about 19,100"
> msn: "about 13635"
> Yahoo: "about 78,300"
> As I write, the RIXbot has 45621 documents in its index. These are
> documents that contain "rebol" in ANY way, so putting <rebol> (as an
> html tag) in a web page, or linking to rebol.com, will cause the
> page to be included in the index. It is intended to work this way.
> This means some pages will not have the word "rebol" on them
> (visibly), but still be indexed.
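As a rough illustration of the inclusion rule described above, here is a minimal Python sketch. It is not RIXbot's actual code (the bot is written in REBOL and its source isn't shown here); the function name mentions_rebol is hypothetical, and it assumes the test is a case-insensitive match against the raw, unparsed page source.

```python
def mentions_rebol(raw_html: str) -> bool:
    """Return True if the raw page source contains "rebol" anywhere.

    Hypothetical helper: assumes a case-insensitive substring test on the
    unparsed source, so tag names and link targets count as matches too.
    """
    return "rebol" in raw_html.lower()
```

With a test like this, a page whose only mention is a link to rebol.com or a <rebol> tag still qualifies, even though the word never shows up in the visible text.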
> The last time I spoke about the RIX on this list, someone suggested
> I make it possible to search through rebol headers. This is now
> done. The bot has several indexes, both in full text and in rebol
> headers. E.g., you can see some of Carl Sassenrath's and Carl Read's
> scripts here: http://www.oops-as.no/rix?q=carl&st=sauthor
Hey - someone did a search for me! ;-)
> Do we need this? I think maybe not.
Well I think so, as the more ways to search for REBOL info the better.
One suggestion for the results: I'd like to see the URLs shown too,
as they provide extra info not given by the webpages' headers. Three
or four links all just saying "REBOL.org Script Library" are not that
helpful. Yes, we can put the mouse-pointer over them, but that's not
> Then why make it? Because rebol
> is fun and a bit too addictive. I really hope I will reach some
> stage that I find satisfactory with this, so I can leave it behind
> and get some sleep...
> There are duplicates in the database: http://rebol.com/,
> http://www.rebol.com/, http://rebol.com/index.html and
> http://www.rebol.com/index.html are all registered. I'm working on a
> filter to get them out.
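One possible shape for such a duplicate filter, sketched in Python. This is not Hallvard's filter; canonical_url is a hypothetical name, and the two rules it applies (drop a leading "www." and a trailing "index.html") are assumptions that happen to fit the four rebol.com examples but won't be safe for every site.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Map common spellings of the same page onto one canonical form.

    Assumed rules (illustrative only): host names are case-insensitive,
    "www.example.com" equals "example.com", and ".../index.html" equals
    ".../". Some servers break these assumptions.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path
    if path.endswith("/index.html"):
        # Keep the trailing slash, drop only the file name.
        path = path[: -len("index.html")]
    if path == "":
        path = "/"
    # The fragment is dropped, since it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))
```

All four URLs listed above come out as http://rebol.com/ under these rules, so keying the database on the canonical form would leave a single entry.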
> Rebol scripts are detected with 'load. Web pages with more than one
> script are currently registered with the first script on the page
> only. This too will be changed if/when I find the time.
> If you're curious about whether or not some page is in the index,
> please use http://www.oops-as.no/rixaddurl to check. I hope this
> index can be more or less exhaustive, so I'm grateful to all who
> tell the bot where to go.
I just did...
RIX URL Submission
OK, your URL wasn't found in the database, so it was added to the
checklist. Thanks for submitting.
RIX works at a pace of 5000 site updates per night. There are
currently 233432 websites before you in the queue, so this URL
should be indexed around 5-May-2004. But keep in mind that this
is only an approximate suggestion.
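The date in that message can be checked with a little arithmetic. Here is a sketch in Python, assuming the queue drains at exactly 5000 entries per night, that a partial night is rounded up, and that the URL was submitted on the day of this post (19-Mar-2004). The function name estimated_index_date is hypothetical; how RIX actually computes its estimate isn't stated.

```python
import math
from datetime import date, timedelta


def estimated_index_date(queue_length: int, per_night: int, today: date) -> date:
    """Estimate when a newly queued URL gets indexed.

    Assumes a fixed nightly throughput and rounds a partial night up.
    """
    nights = math.ceil(queue_length / per_night)  # 233432 / 5000 -> 47 nights
    return today + timedelta(days=nights)


print(estimated_index_date(233432, 5000, date(2004, 3, 19)))  # 2004-05-05
```

Under those assumptions the result is 47 nights after 19 March, which agrees with the 5-May-2004 quoted above.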
It's going to be quite a busy little bot for the foreseeable future,
isn't it? (-:
> So Google reports 188000 pages... But clicking "next" repeatedly
> never gets you to the end. I wonder if this figure is really real...
> Thanks for all the help I have gotten from this list, and thanks to
> Nenad for the mysql protocol in particular.
And thanks for RIX Hallvard - it's a useful tool.
> Prætera censeo Carthaginem esse delendam