[REBOL] regex and robots.txt
From: hallvard:ystad:oops-as:no at: 3-Aug-2004 13:54
I'm thinking about implementing a kind of robots.txt version 2, with regular expressions,
for the rix robot. I need to give the bot directions of my own for a number of sites, and
I'd like to feed them to it as robots.txt files, only with regular expressions in them.
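
To make the idea concrete, here is a minimal sketch in Python of how such a file might be read. Everything in it is an assumption for illustration: the sample rules, the choice to treat each Disallow value as a regular expression searched against the URL path, and the helper names parse_rules and allowed. It is not an existing standard and not how rix works today.

```python
import re

# Hypothetical sample file: ordinary robots.txt layout, but every
# Disallow value is read as a regular expression (an assumption,
# not part of any robots.txt standard).
SAMPLE = "\n".join([
    "# directions for the rix robot",
    "User-agent: rix",
    r"Disallow: ^/cgi-bin/",
    r"Disallow: \.(gif|jpe?g)$",
    r"Disallow: ^/archive/\d{4}/",
])

def parse_rules(text, agent):
    """Return the compiled Disallow patterns that apply to `agent`."""
    rules = []
    applies = False
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines, and anything without a field name.
        if not line or line.startswith("#") or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # Simplification: each User-agent line starts a new record.
            applies = value in ("*", agent)
        elif field == "disallow" and applies and value:
            rules.append(re.compile(value))
    return rules

def allowed(path, rules):
    """A path may be fetched only if no Disallow pattern matches it."""
    return not any(rule.search(path) for rule in rules)

RULES = parse_rules(SAMPLE, "rix")
```

With these sample rules, allowed("/index.html", RULES) is true, while allowed("/cgi-bin/search", RULES) and allowed("/img/logo.jpeg", RULES) are both false.
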
I've been looking for others' work on this - surely, someone must have thought about
using regular expressions in robots.txt files! I found documents like http://www.w3.org/Search/9605-Indexing-Workshop/Papers/[Frumkin--Excite--html],
but nothing very specific. Does anyone know of efforts that have already been made,
or of work that I should look at before going deeper into this?
And about regular expressions: is there a REBOL script somewhere that handles them, or
do I have to write my own? I searched rebol.org, but didn't find anything. (And I couldn't
search with rix, because the db has grown too big for my disk. So for the moment, the
database is down, and rix doesn't work. I either have to buy some more disk space for
the server or look into removing duplicates from the index pretty quickly!)
Any thoughts or tips on these subjects will be much appreciated. Thanks.