Mailing List Archive: 49091 messages

[REBOL] regex and robots.txt

From: hallvard:ystad:oops-as:no at: 3-Aug-2004 13:54

Hi folks,

I'm thinking about implementing a sort of robots.txt version 2, with regular expressions, for the rix robot. I need to give the bot directions of my own for a number of sites, and I would like to feed them to it as robots.txt files, only with regular expressions in them.

I've been looking for others' work on this. Surely someone must have thought about using regular expressions in robots.txt files! I found documents like http://www.w3.org/Search/9605-Indexing-Workshop/Papers/[Frumkin--Excite--html], but nothing very specific. Does anyone know of efforts that have already been made, or work that I should look at before going deeper into this?

And about regular expressions: is there a rebol script somewhere that handles them, or do I have to write my own? I searched rebol.org, but didn't find anything. (I couldn't search with rix, because the db has grown too big for my disc. So for the moment the database is down and rix doesn't work. I either have to buy more disc space for the server or look into removing duplicates from the index pretty quickly!)

Any thoughts or tips on these subjects will be much appreciated.

Thanks.
HY
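
To make the idea concrete, here is a minimal sketch of how regex-based rules in a robots.txt-style file might be read and applied. It is written in Python rather than rebol, and everything specific in it is an assumption for illustration: the `Disallow-regex` directive, the sample rules, and the helper names `parse_rules` and `allowed` are hypothetical and not part of any robots.txt standard or of rix.

```python
import re

# Hypothetical extended robots.txt. "Disallow-regex" is an invented
# directive used only for this sketch.
SAMPLE = """\
User-agent: rix
Disallow-regex: ^/cgi-bin/.*
Disallow-regex: .*\\.(gif|jpg)$
"""


def parse_rules(text, agent):
    """Return the compiled Disallow-regex patterns that apply to `agent`."""
    rules = []
    active = False  # True while inside a User-agent group matching `agent`
    for raw in text.splitlines():
        # Strip comments. Note: this simple approach means a pattern
        # cannot itself contain the "#" character.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            active = value == "*" or value.lower() == agent.lower()
        elif field == "disallow-regex" and active and value:
            rules.append(re.compile(value))
    return rules


def allowed(path, rules):
    """A path may be fetched only if no disallow pattern matches it."""
    return not any(rule.search(path) for rule in rules)


rules = parse_rules(SAMPLE, "rix")
for path in ("/index.html", "/cgi-bin/search", "/images/logo.gif"):
    print(path, allowed(path, rules))
```

With the sample rules above, `/index.html` would be allowed while the other two paths would be refused. A real design would also need to settle how these rules interact with ordinary `Disallow` lines and with `Allow` overrides, which this sketch leaves out.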