clean-path
[1/20] from: gchiu::compkarori::co::nz at: 4-Dec-2000 15:49
Hi,
What does clean-path do? I passed urls with .. and . in
them to it, and they are unchanged.
--
Graham Chiu
[2/20] from: timewarp:sirius at: 3-Dec-2000 19:13
Graham Chiu wrote:
> Hi,
>
> What does clean-path do? I passed urls with .. and . in
> them to it, and they are unchanged.
did you try assigning the returned value to a word?
[3/20] from: timewarp:sirius at: 3-Dec-2000 19:23
Graham Chiu wrote:
> Hi,
>
> What does clean-path do? I passed urls with .. and . in
> them to it, and they are unchanged.
hey, you know what? it really doesn't seem to work.
[4/20] from: al:bri:xtra at: 4-Dec-2000 16:30
> What does clean-path do? I passed urls with .. and . in them to it, and
> they are unchanged.
>> source clean-path
clean-path: func [
    {Cleans-up '.' and '..' in path; returns the cleaned path.}
    target [file! url!]
    /local item file path mark mark2 marks dots dot dotdot target-copy][
    if url? target [return target]
    ...
Andrew Martin
I have no words to speak, but I must rebol...
ICQ: 26227169 http://members.nbci.com/AndrewMartin/
[5/20] from: arolls:bigpond:au at: 4-Dec-2000 14:49
You're right,
The following works (on local files):
>> what-dir
== %/D/Anton/Dev/Rebol/view/
>> clean-path %..
== %/D/Anton/Dev/Rebol/
The following doesn't work (on urls):
>> clean-path http://users.bigpond.net.au/datababies/anton/../
== http://users.bigpond.net.au/datababies/anton/../
>> clean-path http://users.bigpond.net.au/datababies/anton/..
== http://users.bigpond.net.au/datababies/anton/..
'clean-path claims that it accepts urls, so I guess the
feature was forgotten to be implemented.
You should report it to feedback, I reckon.
Anton.
[6/20] from: al:bri:xtra at: 4-Dec-2000 17:47
Anton wrote:
> 'clean-path claims that it accepts urls, so I guess the feature was
> forgotten to be implemented.
> You should report it to feedback, I reckon.
I wouldn't. Where does this URL point to:
http://members.nbci.com/AndrewMartin/../
?
Could Rebol have guessed:
http://www.nbci.com/
?
YMMV, depending upon the site you access with ../.
Andrew Martin
ICQ: 26227169 http://members.nbci.com/AndrewMartin/
[7/20] from: gchiu:compkarori at: 4-Dec-2000 19:23
On Mon, 4 Dec 2000 17:47:11 +1300
"Andrew Martin" <[Al--Bri--xtra--co--nz]> wrote:
> Anton wrote:
> > 'clean-path claims that it accepts urls, so I guess the
<<quoted lines omitted: 8>>
> ?
> YMMV, depending upon the site you access with ../.
Or perhaps the version of Rebol one is using.
>> clean-path http://members.nbci.com/AndrewMartin/../
== http://members.nbci.com/AndrewMartin/../
>> rebol/version
== 0.10.38.3.1
--
Graham Chiu
[8/20] from: gchiu:compkarori at: 4-Dec-2000 19:32
On Mon, 4 Dec 2000 14:49:07 +1100
"Anton" <[arolls--bigpond--net--au]> wrote:
> 'clean-path claims that it accepts urls, so I guess the
> feature was forgotten to be implemented.
> You should report it to feedback, I reckon.
Hi Anton,
You're right. This is from the source
>> source clean-path
clean-path: func [
    {Cleans-up '.' and '..' in path; returns the cleaned path.}
    target [file! url!]
    /local item file path mark mark2 marks dots dot dotdot target-copy][
    if url? target [return target]
So, if the parameter is a url, it just returns it unchanged.
Guess it was just forgotten about.
I just noticed this while trying to figure out why Bo's
webcrawler.r doesn't work.
I'll send it to the feedback bot.
--
Graham Chiu
[9/20] from: al:bri:xtra at: 4-Dec-2000 20:17
> > YMMV, depending upon the site you access with ../.
> >
>
> Or perhaps the version of Rebol one is using.
Check it out in a browser! :-)
Andrew Martin
ICQ: 26227169 http://members.nbci.com/AndrewMartin/
[10/20] from: arolls:bigpond:au at: 4-Dec-2000 19:43
> Anton wrote:
> > 'clean-path claims that it accepts urls, so I guess the feature was
<<quoted lines omitted: 6>>
> http://www.nbci.com/
> ?
Maybe it can.
If you look at a trace/net of
read http://members.nbci.com/AndrewMartin/../
you can see (many of):
Net-log: "HTTP/1.1 302 Found"
URL Parse: none none www.nbci.com none none none
Net-log: ["Opening tcp for" HTTP]
Net-log: {GET / HTTP/1.0
Accept: */*
Connection: close
User-Agent: REBOL 0.10.38.3.1
Host: members.nbci.com
Extracting the above www.nbci.com is a little bit more complicated
than just operating the same as local paths but we should at least
try, hey?
Anyway, local paths can lead to trouble too, can't they?
What about symbolic links and volumes mounted as /blah
in unix-land? eg. /usr/../../ Where does that take you?
(Is that a bad example?)
So I still think it should be reported.
This has been a code-free document.
Anton.
[11/20] from: g:santilli:tiscalinet:it at: 4-Dec-2000 13:02
Erin Thomas wrote:
> > What does clean-path do? I passed urls with .. and . in
> > them to it, and they are unchanged.
>
> hey, you know what? it really doesn't seem to work.
It works on files only. ../ are significant in URLS sometimes, so
CLEAN-PATH simply doesn't remove them. You could write your own
function that is smart enough to detect those situations...
Regards,
Gabriele.
--
Gabriele Santilli <[giesse--writeme--com]> - Amigan - REBOL programmer
Amiga Group Italia sez. L'Aquila -- http://www.amyresource.it/AGI/
[12/20] from: gchiu:compkarori at: 5-Dec-2000 7:33
On Mon, 04 Dec 2000 13:02:07 +0100
Gabriele Santilli <[g--santilli--tiscalinet--it]> wrote:
> It works on files only. ../ are significant in URLS
> sometimes, so
> CLEAN-PATH simply doesn't remove them. You could write
> your own
> function that is smart enough to detect those
> situations...
I noticed that Bo used it in his webcrawler.r, so I guess
someone at Rebol thought it worked on urls :-)
--
Graham Chiu
[13/20] from: sterling:rebol at: 4-Dec-2000 11:12
The main reason that URLs are not translated by clean-path is
basically that you do not really know what directory translation is in
effect on the contacted site. Take FTP as an example:
read ftp://user1:[pass--example--com]/readme.txt
This file will most likely reside in /home/user1/readme.txt.
Now let's say that user1 has shared their directory and this file for
others to read. So user2 wants to get it:
read ftp://user2:[pass2--example--com]/../user1/readme.txt
This will work because the "root" directory that is specified by
ftp://user2:[pass2--example--com]/ is actually in /home/user2/. The
problem with cleaning the URL is this:
clean-path ftp://user2:[pass2--example--com]/../user1/readme.txt would
come out as:
ftp://user2:[pass2--example--com]/user1/readme.txt
which is clearly not right.
Some FTP servers will not allow you to back out of your home directory
but some do. The point is that we don't know so we leave it open and
do no translation.
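
To make the point concrete, here is a minimal sketch using clean-path on
local file paths, where it does work. It assumes, as in the example above,
that user2's FTP root maps to /home/user2/ on the server; both file paths
are illustrative only.

```rebol
; Assumption: the FTP root for user2 is /home/user2/ on the server.
; Path the server would resolve for the original URL:
print clean-path %/home/user2/../user1/readme.txt
; Path it would resolve if ".." had been collapsed on the client first:
print clean-path %/home/user2/user1/readme.txt
```

The first should come out under /home/user1/ and the second under
/home/user2/user1/, so the cleaned URL would name a different file.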
Sterling
[14/20] from: gchiu:compkarori at: 5-Dec-2000 9:43
On Mon, 4 Dec 2000 11:12:05 -0800
[sterling--rebol--com] wrote:
> Some FTP servers will not allow you to back out of your
> home directory
> but some do. The point is that we don't know so we leave
> it open and
> do no translation.
>
I think this is a cop out :-) I would mainly use a word
like clean-path when spidering a website, and not an ftp
site. The website will most likely use .. and . to point to
legal directories.
--
Graham Chiu
[15/20] from: sterling:rebol at: 4-Dec-2000 15:23
So where do you run into problems in the web spidering?
Sterling
[16/20] from: gchiu:compkarori at: 5-Dec-2000 13:37
On Mon, 4 Dec 2000 15:23:11 -0800
[sterling--rebol--com] wrote:
> So where do you run into problems in the web spidering?
>
Okay, this is a real world example. I often need to grab
product images from websites.
For example:
http://www.asus.com/Products/Addon/Vga/Agpv3800/index.html
You can see there that the images are referenced as
ball-yellow.gif - current directory
/Image/logo-title.gif - off the root directory
../../../Images/arrow.gif - up 3 directories
If 'clean-path worked on urls, that would make it much
easier. As it was, I wrote my reblet
http://www.compkarori.co.nz/reb/imagegrabber.r
before I even knew 'clean-path existed :-)
If you try out the above, perhaps you would enlighten me as
to why the slider on the side of the text-list doesn't
update <g>
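
A minimal sketch of how those three kinds of reference could be resolved
against the page URL. It assumes REBOL 2 and plain http URLs with at least
one "/" after the host, and it ignores query strings and fragments. The
names split-segments, clean-url-path and resolve-ref are hypothetical
helpers made up for this illustration; they are not part of REBOL and not
how clean-path itself is implemented.

```rebol
REBOL [Title: "Resolve relative references (illustrative sketch)"]

; Hypothetical helper: split a string on "/" and keep empty segments.
split-segments: func [str [string!] /local segs pos] [
    segs: copy []
    while [pos: find str "/"] [
        append segs copy/part str pos   ; text before the next "/"
        str: next pos                   ; continue after that "/"
    ]
    append segs copy str                ; whatever follows the last "/"
    segs
]

; Hypothetical helper: collapse "." and ".." in the path part of a URL.
; Assumes the URL contains "://" and at least one "/" after the host.
clean-url-path: func [url [url! string!] /local str slash site segs out seg path] [
    str: to-string url
    slash: find find/tail str "://" "/"     ; first "/" after the host
    site: copy/part str slash               ; scheme and host, no trailing "/"
    segs: split-segments copy next slash
    ; a trailing "." or ".." names a directory, so keep a trailing "/"
    if find [".." "."] last segs [append segs ""]
    out: copy []
    foreach seg segs [
        either seg = ".." [
            ; step up one level, but never above the site root
            if not empty? out [remove back tail out]
        ][
            if seg <> "." [append out seg]
        ]
    ]
    path: copy ""
    foreach seg out [append path seg append path "/"]
    remove back tail path                   ; drop the extra final "/"
    to-url rejoin [site "/" path]
]

; Hypothetical helper: resolve a reference found in a page against the page URL.
resolve-ref: func [base [url!] ref [string!] /local str] [
    str: to-string base
    if find ref "://" [return to-url ref]   ; already absolute
    either #"/" = pick ref 1 [
        ; "/Image/..." is relative to the site root
        clean-url-path rejoin [copy/part str find find/tail str "://" "/" ref]
    ][
        ; "name.gif" or "../..." is relative to the page's directory
        clean-url-path rejoin [copy/part str next find/last str "/" ref]
    ]
]

page: http://www.asus.com/Products/Addon/Vga/Agpv3800/index.html
foreach ref ["ball-yellow.gif" "/Image/logo-title.gif" "../../../Images/arrow.gif"] [
    print resolve-ref page ref
]
```

If the sketch behaves as intended, the last reference should resolve to
http://www.asus.com/Products/Images/arrow.gif. Whether collapsing ".." on
the client is safe still depends on the server, so this is best kept to
web pages rather than FTP.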
--
Graham Chiu
[17/20] from: sterling:rebol at: 4-Dec-2000 20:12
Right. I see what you mean. Perhaps the more frequent use of
clean-path on a URL is this. We'll talk about it the next time we
meet about /Core fixes/enhancements/etc. The only issue may be that
we would be breaking code that relied on the current behavior... how
much of that there is I can't say.
The slider does not update on the text list unless you tell it to. I
use the following function in my code to update text-list sliders
every time I mod the list.
The function refers to list/lc. lc is a word in a text-list face that
is the number of visible lines of the list.
; updates the bar on the side of a text-list or group of text-lists
fix-slider: func [faces [object! block!]] [
    foreach list to-block faces [
        either 0 = length? list/data [list/sld/redrag 1] [
            list/sld/redrag list/lc / length? list/data]
    ]
]
This way I can make changes to one or more text-lists in a layout, fix
the sliders, and then re-show the needed faces.
Sterling
[18/20] from: gchiu:compkarori at: 5-Dec-2000 21:25
On Mon, 4 Dec 2000 20:12:11 -0800
[sterling--rebol--com] wrote:
> meet about /Core fixes/enhancements/etc. The only issue
> may be that
> we would be breaking code that relied on the current
> behavior... how
> much of that there is I can't say.
Probably not a lot since there isn't much point using it on
urls at present :-)
> The slider does not update on the text list unless you
> tell it to. I
> use the following function in my code to update text-list
> sliders
> every time I mod the list.
thanks. Never would have worked that out myself.
--
Graham Chiu
[19/20] from: allenk:powerup:au at: 5-Dec-2000 22:56
Trawling through my archives I found this one from Bo. Hope it's useful.
REBOL [
    Title: "Clean HTTP path"
    Date: 16-Sep-1999
    Author: "Bohdan Lechnowsky"
    Email: [bo--rebol--com]
    File: %cleanhttp.r
    Purpose: {
        To remove /../ from within HTTP URLs. Will remove parent
        directories for each /../ encountered, but will not remove
        site information.
    }
]

clean-parents: func [url][
    ; split on "/"; for http://host/... the first three items
    ; should be "http:", "" and the host
    url: parse url "/"
    forall url [
        while [url/1 = ".."] [
            either (index? url) <= 4 [
                ; ".." directly after the host: drop it, keep the site part
                remove url
            ][
                ; drop the parent segment together with the ".."
                remove/part back url 2
                url: back url
            ]
        ]
    ]
    to-url join "http" to-url remove head url
]
Cheers,
Allen K
[20/20] from: gchiu:compkarori at: 6-Dec-2000 7:54
On Tue, 5 Dec 2000 22:56:57 +1000
"Allen Kamp" <[allenk--powerup--com--au]> wrote:
> Trawling through my archives I found this one from Bo.
> Hope its useful..
<<quoted lines omitted: 7>>
> To remove /../ from within HTTP URLs. Will
> remove parent
Cool. Perhaps Bo should submit this as a useful mezzanine
function rather than overloading clean-path.
--
Graham Chiu
Notes
- Quoted lines have been omitted from some messages.