[REBOL] Re: ANN: SiteCrawl
From: arolls:bigpond:au at: 25-Jul-2001 4:34
> rebol-pages: copy []
> SiteCrawl http://www.rebol.com rebol-pages
>
> I need feedback on this. Do you have a small site where you can test
> 'SiteCrawl for me?
> -Ryan
My site is fairly small, so you can check it out easily:
http://users.bigpond.net.au/datababies/anton/index.html
I don't think all links are written with surrounding
quotes, as your pageLinks function assumes.
Maybe it's not the official way, but IE lets this through:
<a href=http://antonrolls.net>mysite</a>
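A parse rule along these lines might accept both the quoted and the
unquoted form. This is only a sketch written for REBOL 2: extract-hrefs
and url-char are made-up names, not part of SiteCrawl, and single-quoted
values are not handled.

```rebol
REBOL [Title: "href sketch"]

; Hypothetical helper, not the pageLinks function from SiteCrawl.
; Characters allowed in an unquoted value: anything except a space,
; tab, newline, double quote, or the closing ">".
url-char: complement charset { ^-^/">}

extract-hrefs: func [html [string!] /local links url] [
    links: copy []
    ; parse/all so that whitespace is not skipped between rules
    parse/all html [
        any [
            thru "href=" [
                {"} copy url to {"}        ; quoted value
                | copy url some url-char    ; unquoted value
            ]
            (if all [url not empty? url] [append links url])
        ]
        to end
    ]
    links
]

probe extract-hrefs {<a href="TechSupport/">x</a> <a href=http://antonrolls.net>y</a>}
```

The quoted alternative is tried first, so the unquoted rule only runs
when the value does not start with a double quote.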
Also, it doesn't catch a link such as this one from my site,
which points at a directory without naming an index.html file:
<a href="TechSupport/">Tech Support</a><br>
I think it should look for:
TechSupport/index.htm(l)
TechSupport/default.htm(l)
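As a rough sketch of that idea, a crawler could build the candidate
URLs for any link ending in "/". The name default-pages is hypothetical,
and the file names are just the four suggested above; a server may be
configured to use others.

```rebol
REBOL [Title: "default page sketch"]

; Hypothetical helper: list the default documents worth trying
; for a link that names a directory.
default-pages: func [dir-url [url!] /local out] [
    out: copy []
    foreach name ["index.html" "index.htm" "default.html" "default.htm"] [
        append out join dir-url name   ; join keeps the url! type
    ]
    out
]

foreach candidate default-pages http://users.bigpond.net.au/datababies/anton/TechSupport/ [
    print candidate
]
```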
Interestingly, if you run
trace/net on
read http://users.bigpond.net.au/datababies/anton/TechSupport
trace/net off
you can see it tries first to find the file "TechSupport",
then it tries to get the directory "TechSupport/".
In your SiteCrawl function, where it is written:
if error? try [...][
links: next links
]
It seems as if you are relying on the error to
occur. An error occurs for all of the links in
my site.
And why do you write links: next links? Surely
the next link will come along in the next iteration
of the foreach loop. I suggest just doing nothing: [].
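A small network-free sketch of the point about foreach: the block
below is a stand-in for the list of links (the zeros play the role of
links that fail), and the error branch is left empty. The names items
and visited are invented for the illustration.

```rebol
REBOL [Title: "foreach sketch"]

items: [1 0 2 0 4]   ; hypothetical stand-in for the links
visited: 0

foreach n items [
    visited: visited + 1
    ; dividing by zero raises an error, which try catches;
    ; the error branch does nothing and foreach still moves on
    if error? try [8 / n] []
]

print ["items visited:" visited]
```

Every item is visited even though nothing advances the series by hand
inside the error branch.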
Regards,
Anton.