[REBOL] Re: Rebol web presence statistics

From: carl:cybercraft at: 19-Mar-2004 20:47

On 19-Mar-04, Hallvard Ystad wrote:
> Hi list
>
> A quick tour around the search engines reveals they find this many
> documents about "rebol":
>
> Google: "about 188,000"
> Altavista: 16,721
> Alltheweb: 51,983
> Hotbot: 14,131
> Teoma: "about 19,100"
> msn: "about 13635"
> Yahoo: "about 78,300"
>
> As I write, the RIXbot has 45621 documents in its index. These are
> documents that contain "rebol" in ANY way, so putting <rebol> (as an
> html tag) in a web page, or linking to, will cause the
> page to be included in the index. It is intended to work this way.
> This means some pages will not have the word "rebol" on them
> (visibly), but still be indexed.
>
> The last time I spoke about the RIX on this list, someone suggested
> I make it possible to search through rebol headers. This is now
> done. The bot has several indexes, both in full text and in rebol
> headers. E.g., you can see some of Carl Sassenrath's and Carl Read's
> scripts here:
Hey - someone did a search for me! ;-)
> Do we need this? I think maybe not.
Well, I think so, as the more ways there are to search for REBOL info the better. One suggestion for the results: I'd like to see the URLs shown too, as they provide extra info not given by the webpages' headers. Three or four links all just saying "Script Library" are not that helpful. Yes, we can put the mouse-pointer over them, but that's not too friendly.
> Then why make it? Because rebol
> is fun and a bit too addictive. I really hope I will reach some
> stage that I find satisfactory with this, so I can leave it behind
> and get some sleep...
>
> There are duplicates in the database:,
>, and
> are all registered. I'm working on a
> filter to get them out.
>
> Rebol scripts are detected with 'load. Web pages with more than one
> script are currently registered with the first script on the page
> only. This too will be changed if/when I find the time.
>
> If you're curious about whether or not some page is in the index,
> please use to check. I hope this
> index can be more or less exhaustive, so I'm grateful to all who
> tell the bot where to go.
I just did...

    RIX URL Submission

    OK, your URL wasn't found in the database, so it was added to the checklist. Thanks for submitting. RIX works at a pace of 5000 site updates per night. There are currently 233432 websites before you in the queue, so this URL should be indexed around 5-May-2004. But keep in mind that this is only an approximate suggestion.

It's going to be quite a busy little bot for the foreseeable future, isn't it? (-:
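
For anyone curious, the date in that message is consistent with simple queue arithmetic. The sketch below is in Python and is only a guess at the calculation: how RIX actually computes its estimate isn't stated here. The function name `estimate_index_date`, rounding up to whole nights, and using 19-Mar-2004 (the date of this post) as the submission date are all assumptions made for illustration.

```python
import math
from datetime import date, timedelta


def estimate_index_date(queue_length, updates_per_night, submitted):
    """Estimate when a queued URL gets indexed (hypothetical helper).

    Assumes the bot works through the queue in order at a fixed
    nightly rate and that a partly used night counts as a whole one.
    """
    nights = math.ceil(queue_length / updates_per_night)
    return submitted + timedelta(days=nights)


# Figures quoted in the RIX submission message; the submission date
# is assumed to be the date of this post.
eta = estimate_index_date(233432, 5000, date(2004, 3, 19))
print(eta.isoformat())
```

233432 sites at 5000 per night rounds up to 47 nights, and 47 days after 19-Mar-2004 is 5-May-2004, which matches what RIX reported.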
> So Google reports 188000 pages... But clicking "next" repeatedly
> never gets you to the end. I wonder if this figure is really real...
>
> Thanks for all the help I have gotten from this list, and thanks to
> Nenad for the mysql protocol in particular.
And thanks for RIX, Hallvard - it's a useful tool.
-- Carl Read