[ANN] (but what should I call it?)
[1/6] from: hallvard:ystad:helpinhand at: 11-May-2003 12:18
Hi folks
We don't really need this - there are enough search engines for the web as it is.
Still - I made a little robot that searches the web for pages containing the word "rebol".
(It was fun making this, but I do believe that a Google search including +rebol will
yield better results.) Pages lacking the magical word are not indexed.
You can try it on http://folk.uio.no/hallvary/rebindex/
At the moment, anything you write in the search box will be treated as a string, and
the search engine will look for that exact string. I'll change this soon.
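As a rough illustration of the behaviour described above (only pages containing "rebol" are indexed, and the query is matched as one exact string), here is a small Python sketch. It is not the robot's actual REBOL code; the function names and the case-insensitive matching are assumptions made for the example.

```python
def should_index(page_text: str) -> bool:
    """Index a page only if it mentions "rebol" (case-insensitive: an assumption)."""
    return "rebol" in page_text.lower()


def exact_search(query: str, pages: dict[str, str]) -> list[str]:
    """Return the URLs whose text contains the whole query as one exact string."""
    needle = query.strip().lower()
    if not needle:
        # An empty query would match every page, so return nothing instead.
        return []
    return [url for url, text in pages.items() if needle in text.lower()]
```

With this kind of matching, a query such as "rebol parse" only finds pages where those two words appear next to each other in that order, which is why treating the query as separate terms would be the natural next step.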
I called this a "rebindexer", but I'd very much like name suggestions! Thanks.
If the robot hasn't indexed your page, but you'd like it to, just send me a note, and
I'll fix it.
I haven't got any /pro licence and cannot access a database from rebol, so this application
stores all indexed pages (compressed) in a text file. It runs with the newest rebol/core
on Mac OS X. I don't know how it will scale, so if you guys all try it at the same time,
we will see. This will be an interesting experience...
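For readers wondering how compressed pages can live in a plain text file, one possible layout is one record per line: the URL, a tab, then the compressed page encoded as base64. The Python sketch below shows that idea only; the record format and the function names are assumptions, and the real robot may store its data quite differently.

```python
import base64
import zlib


def encode_record(url: str, page_text: str) -> str:
    """Pack one page into a single text line: URL, tab, base64 of the zlib data."""
    packed = zlib.compress(page_text.encode("utf-8"))
    return url + "\t" + base64.b64encode(packed).decode("ascii")


def decode_record(line: str) -> tuple[str, str]:
    """Reverse encode_record: split off the URL, then decode and decompress."""
    url, payload = line.rstrip("\n").split("\t", 1)
    text = zlib.decompress(base64.b64decode(payload)).decode("utf-8")
    return url, text
```

A layout like this has to be scanned and decompressed line by line for every search, which is one reason to expect scaling trouble as the file grows.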
Enjoy,
~H
[2/6] from: hallvard:ystad:helpinhand at: 11-May-2003 15:30
Dixit Andreas Bolka (14.20 11.05.2003):
>does it obey robots.txt? if yes, what is its agent id? if no, why not?
It reads robots.txt, but doesn't obey it yet. I plan to, of course, but haven't gotten that
far. For the moment, it excludes anything that smells like dynamically created pages
(cgi-bin, .php, .r, .asp, .jsp, .pl, .cfm ...) I just couldn't resist posting this now,
even though some work is yet to be done with it...
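To make the two checks discussed here concrete, the following Python sketch filters out URLs that look dynamically generated and tests a URL against robots.txt rules using the standard library's urllib.robotparser. It is an illustration under assumptions, not the robot's REBOL code: the suffix list contains only the entries named above (the original list is elided with "..."), and the agent name "rebindexer" is a placeholder since no agent ID has been chosen.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Only the patterns named in the message; the original list is longer.
DYNAMIC_SUFFIXES = (".php", ".r", ".asp", ".jsp", ".pl", ".cfm")


def looks_dynamic(url: str) -> bool:
    """Guess from the URL path alone whether a page is generated dynamically."""
    path = urlparse(url).path.lower()
    return "cgi-bin" in path or path.endswith(DYNAMIC_SUFFIXES)


def allowed_by_robots(robots_txt: str, agent: str, url: str) -> bool:
    """Check a URL against the text of an already downloaded robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

A crawler would fetch a URL only when looks_dynamic returns False and allowed_by_robots returns True for its own agent name.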
Agent ID? I haven't decided on its name yet. Suggestions are welcome.
~H
[3/6] from: andreas:bolka:gmx at: 11-May-2003 13:20
Sunday, May 11, 2003, 11:18:30 AM, Hallvard wrote:
> We don't really need this - there are enough search engines for the
> web as it is. Still - I made a little robot that searches the web
> for pages containing the word "rebol".
does it obey robots.txt? if yes, what is its agent id? if no, why not?
(see e.g. http://www.searchengineworld.com/robots/robots_tutorial.htm
or http://www.robotstxt.org/wc/norobots.html)
--
Best regards,
Andreas mailto:[andreas--bolka--gmx--net]
[4/6] from: ingo::2b1::de at: 11-May-2003 17:12
Hi Hallvard,
Hallvard Ystad wrote:
<..>
> For the moment, it excludes anything that smells like dynamically
> created pages (cgi-bin, .php, .r, .asp, .jsp, .pl, .cfm ...) I just
> couldn't resist posting this now, even though some work is yet to be
> done with it...
To make it really useful, try to read *.r files first, and index them if
they contain a REBOL[] header. That's something we _could_ need.
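A rough way to test a fetched *.r file for a script header is to look for the word REBOL followed by an opening bracket at the start of a line. The Python sketch below does just that; it is a heuristic, not a REBOL parser, and the function name and regular expression are assumptions made for illustration.

```python
import re

# The word REBOL (any letter case) at the start of a line, followed by "[".
HEADER_RE = re.compile(r"^\s*REBOL\s*\[", re.IGNORECASE | re.MULTILINE)


def has_rebol_header(source: str) -> bool:
    """Return True if the text appears to contain a REBOL script header."""
    return HEADER_RE.search(source) is not None
```

Because the pattern may match on any line, a shebang line or other text ahead of the header does not prevent detection.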
Kind regards,
Ingo
[5/6] from: ingo:2b1 at: 11-May-2003 17:56
Hi Hallvard,
Hallvard Ystad wrote:
<..>
> I haven't got any /pro licence and cannot access a database from
> rebol,
For MySQL access you don't need /Pro; DocKimbel's /Core driver is
available through softinnov ( http://www.softinnov.com ) at
http://rebol.softinnov.org/mysql/
Kind regards,
Ingo
[6/6] from: hallvard:ystad:helpinhand at: 12-May-2003 11:20
Dixit Ingo Hohmann (17.56 11.05.2003):
>Hi Hallvard,
>For MySQL access you don't need /Pro; DocKimbel's /Core driver is available through softinnov
>( http://www.softinnov.com ) at
>http://rebol.softinnov.org/mysql/
Ingo, that's brilliant news! I'll look into it as soon as I can...
~H