[REBOL] Re: Need some url purification functions
From: g:santilli:tiscalinet:it at: 11-Jan-2001 19:13
Hello Gunjan!
On 10-Jan-01, you wrote:
GK> The problem arises when my crawler extracts urls like /demo
GK> or ../demo or ../../demo or /demo/index.html etc i.e. when a
GK> relative path is used instead of an absolute path. Then my
GK> poor crawler gets absolutely confused and does not know what
GK> to do.
Some hints that should help you start out:
>> url-obj: make object! [user: pass: host: port-id: path: target: none]
>> net-utils/url-parser/parse-url url-obj http://www.yahoo.com/temp.html
== "temp.html"
>> print mold url-obj
make object! [
    user: none
    pass: none
    host: "www.yahoo.com"
    port-id: none
    path: none
    target: "temp.html"
]
>> net-utils/url-parser/parse-url url-obj http://www.yahoo.com/dir/temp.html
== "temp.html"
>> print mold url-obj
make object! [
    user: none
    pass: none
    host: "www.yahoo.com"
    port-id: none
    path: "dir/"
    target: "temp.html"
]
>> clean-path join %/ [url-obj/path %./demo]
== %/dir/demo
>> clean-path join %/ [url-obj/path %../demo]
== %/demo
HTH,
Gabriele.
--
Gabriele Santilli <[giesse--writeme--com]> - Amigan - REBOL programmer
Amiga Group Italia sez. L'Aquila -- http://www.amyresource.it/AGI/