clean-path

[1/20] from: gchiu::compkarori::co::nz at: 4-Dec-2000 15:49

Hi, What does clean-path do? I passed urls with .. and . in them to it, and they are unchanged. -- Graham Chiu

[2/20] from: timewarp:sirius at: 3-Dec-2000 19:13

Graham Chiu wrote:

> Hi,
>
> What does clean-path do? I passed urls with .. and . in
> them to it, and they are unchanged.

did you try assigning the returned value to a word?

[3/20] from: timewarp:sirius at: 3-Dec-2000 19:23

Graham Chiu wrote:

> Hi,
>
> What does clean-path do? I passed urls with .. and . in
> them to it, and they are unchanged.

hey, you know what? it really doesn't seem to work.

[4/20] from: al:bri:xtra at: 4-Dec-2000 16:30

> What does clean-path do? I passed urls with .. and . in them to it, and
> they are unchanged.

>> source clean-path

clean-path: func [
    {Cleans-up '.' and '..' in path; returns the cleaned path.}
    target [file! url!]
    /local item file path mark mark2 marks dots dot dotdot target-copy
][
    if url? target [return target]
    ...

Andrew Martin
I have no words to speak, but I must rebol...
ICQ: 26227169
http://members.nbci.com/AndrewMartin/

[5/20] from: arolls:bigpond:au at: 4-Dec-2000 14:49

You're right, The following works (on local files):

>> what-dir

== %/D/Anton/Dev/Rebol/view/

>> clean-path %..

== %/D/Anton/Dev/Rebol/

The following doesn't work (on urls):

>> clean-path http://users.bigpond.net.au/datababies/anton/../

== http://users.bigpond.net.au/datababies/anton/../

>> clean-path http://users.bigpond.net.au/datababies/anton/..

== http://users.bigpond.net.au/datababies/anton/..

'clean-path claims that it accepts urls, so I guess the feature was forgotten to be implemented. You should report it to feedback, I reckon.

Anton.

[6/20] from: al:bri:xtra at: 4-Dec-2000 17:47

Anton wrote:

> 'clean-path claims that it accepts urls, so I guess the feature was
> forgotten to be implemented.
> You should report it to feedback, I reckon.

I wouldn't. Where does this URL point to:

    http://members.nbci.com/AndrewMartin/../

? Could Rebol have guessed:

    http://www.nbci.com/

? YMMV, depending upon the site you access with ../.

Andrew Martin
ICQ: 26227169
http://members.nbci.com/AndrewMartin/

[7/20] from: gchiu:compkarori at: 4-Dec-2000 19:23

On Mon, 4 Dec 2000 17:47:11 +1300 "Andrew Martin" <[Al--Bri--xtra--co--nz]> wrote:

> Anton wrote: > > 'clean-path claims that it accepts urls, so I guess the

<<quoted lines omitted: 8>>

> ? > YMMV, depending upon the site you access with ../.

Or perhaps the version of Rebol one is using.

>> clean-path http://members.nbci.com/AndrewMartin/../

== http://members.nbci.com/AndrewMartin/../

>> rebol/version

== 0.10.38.3.1

-- Graham Chiu

[8/20] from: gchiu:compkarori at: 4-Dec-2000 19:32

On Mon, 4 Dec 2000 14:49:07 +1100 "Anton" <[arolls--bigpond--net--au]> wrote:

> 'clean-path claims that it accepts urls, so I guess the
> feature was forgotten to be implemented.
> You should report it to feedback, I reckon.

Hi Anton, You're right. This is from the source

>> source clean-path

clean-path: func [
    {Cleans-up '.' and '..' in path; returns the cleaned path.}
    target [file! url!]
    /local item file path mark mark2 marks dots dot dotdot target-copy
][
    if url? target [return target]

So, if the parameter is a url, it just returns it unchanged. Guess it was just forgotten about. I just noticed this while trying to figure out why Bo's webcrawler.r doesn't work. I'll send it to the feedback bot.

-- Graham Chiu
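[Editor's sketch] The behaviour described above (local paths are cleaned, URLs are handed back untouched) can be illustrated outside REBOL. The following is a minimal Python sketch, not the REBOL implementation: the name `clean_path_sketch` and the `"://"` test for "is this a URL" are assumptions made for illustration, and `posixpath.normpath` does not keep a trailing slash the way the REBOL console output does.

```python
import posixpath


def clean_path_sketch(target: str) -> str:
    """Hypothetical stand-in for the behaviour seen in the source listing."""
    # Mirrors `if url? target [return target]`: URLs pass through unchanged.
    if "://" in target:
        return target
    # Local paths get '.' and '..' collapsed purely textually.
    return posixpath.normpath(target)


print(clean_path_sketch("/D/Anton/Dev/Rebol/view/.."))
# /D/Anton/Dev/Rebol
print(clean_path_sketch("http://users.bigpond.net.au/datababies/anton/../"))
# http://users.bigpond.net.au/datababies/anton/../
```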

[9/20] from: al:bri:xtra at: 4-Dec-2000 20:17

> > YMMV, depending upon the site you access with ../. > > > > Or perhaps the version of Rebol one is using.

Check it out in a browser! :-) Andrew Martin ICQ: 26227169 http://members.nbci.com/AndrewMartin/

[10/20] from: arolls:bigpond:au at: 4-Dec-2000 19:43

> Anton wrote: > > 'clean-path claims that it accepts urls, so I guess the feature was

<<quoted lines omitted: 6>>

> http://www.nbci.com/ > ?

Maybe it can. If you look at a trace/net of

    read http://members.nbci.com/AndrewMartin/../

you can see (many of):

    Net-log: "HTTP/1.1 302 Found"
    URL Parse: none none www.nbci.com none none none
    Net-log: ["Opening tcp for" HTTP]
    Net-log: {GET / HTTP/1.0
    Accept: */*
    Connection: close
    User-Agent: REBOL 0.10.38.3.1
    Host: members.nbci.com

Extracting the above www.nbci.com is a little bit more complicated than just operating the same as local paths but we should at least try, hey?

Anyway, local paths can lead to trouble too, can't they? What about symbolic links and volumes mounted as /blah in unix-land? eg. /usr/../../ Where does that take you? (Is that a bad example?) So I still think it should be reported.

This has been a code-free document.

Anton.

[11/20] from: g:santilli:tiscalinet:it at: 4-Dec-2000 13:02

Erin Thomas wrote:

> > What does clean-path do? I passed urls with .. and . in
> > them to it, and they are unchanged.
>
> hey, you know what? it really doesn't seem to work.

It works on files only. ../ are significant in URLS sometimes, so CLEAN-PATH simply doesn't remove them. You could write your own function that is smart enough to detect those situations... Regards, Gabriele. -- Gabriele Santilli <[giesse--writeme--com]> - Amigan - REBOL programmer Amiga Group Italia sez. L'Aquila -- http://www.amyresource.it/AGI/

[12/20] from: gchiu:compkarori at: 5-Dec-2000 7:33

On Mon, 04 Dec 2000 13:02:07 +0100 Gabriele Santilli <[g--santilli--tiscalinet--it]> wrote:

> It works on files only. ../ are significant in URLS sometimes, so
> CLEAN-PATH simply doesn't remove them. You could write your own
> function that is smart enough to detect those situations...

I noticed that Bo used it in his webcrawler.r, so I guess someone at Rebol thought it worked on urls :-) -- Graham Chiu

[13/20] from: sterling:rebol at: 4-Dec-2000 11:12

The main reason that URLs are not translated by clean-path is basically that you do not really know what directory translation is in effect on the contacted site. Take FTP as an example:

    read ftp://user1:[pass--example--com]/readme.txt

This file will most likely reside in /home/user1/readme.txt. Now let's say that user1 has shared their directory and this file for others to read. So user2 wants to get it:

    read ftp://user2:[pass2--example--com]/../user1/readme.txt

This will work because the "root" directory that is specified by ftp://user2:[pass2--example--com]/ is actually in /home/user2/. The problem with cleaning the URL is this:

    clean-path ftp://user2:[pass2--example--com]/../user1/readme.txt

would come out as:

    ftp://user2:[pass2--example--com]/user1/readme.txt

which is clearly not right. Some FTP servers will not allow you to back out of your home directory but some do. The point is that we don't know so we leave it open and do no translation.

Sterling
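[Editor's sketch] Sterling's FTP argument can be checked with a few lines of Python. This is only a model under stated assumptions: `server_resolve` and `textual_clean` are hypothetical names, and the server is assumed to resolve the URL path relative to the login directory and to allow backing out of it, which, as the message says, only some servers do.

```python
import posixpath


def server_resolve(login_dir: str, url_path: str) -> str:
    """Assumed model: the server resolves the URL path relative to the login directory."""
    return posixpath.normpath(posixpath.join(login_dir, url_path.lstrip("/")))


def textual_clean(url_path: str) -> str:
    """Collapse '..' using only the text of the URL path, with no server knowledge."""
    return posixpath.normpath(url_path)


requested = "/../user1/readme.txt"

# What user2 actually asked for:
print(server_resolve("/home/user2", requested))
# /home/user1/readme.txt

# What user2 would get if the URL were cleaned first:
print(server_resolve("/home/user2", textual_clean(requested)))
# /home/user2/user1/readme.txt
```

The two results differ, which is the case the message describes as "clearly not right".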

[14/20] from: gchiu:compkarori at: 5-Dec-2000 9:43

On Mon, 4 Dec 2000 11:12:05 -0800 [sterling--rebol--com] wrote:

> Some FTP servers will not allow you to back out of your home directory
> but some do. The point is that we don't know so we leave it open and
> do no translation.

I think this is a cop out :-) I would mainly use a word like clean-path when spidering a website, and not an ftp site. The website will most likely use .. and . to point to legal directories. -- Graham Chiu

[15/20] from: sterling:rebol at: 4-Dec-2000 15:23

So where do you run into problems in the web spidering?

Sterling

[16/20] from: gchiu:compkarori at: 5-Dec-2000 13:37

On Mon, 4 Dec 2000 15:23:11 -0800 [sterling--rebol--com] wrote:

> So where do you run into problems in the web spidering?

Okay, this is a real world example. I often need to grab product images from websites. For example:

    http://www.asus.com/Products/Addon/Vga/Agpv3800/index.html

You can see there that the images are referenced as

    ball-yellow.gif - current directory
    /Image/logo-title.gif - off the root directory
    ../../../Images/arrow.gif - up 3 directories

If 'clean-path worked on urls, that would make it much easier. As it was, I wrote my reblet

    http://www.compkarori.co.nz/reb/imagegrabber.r

before I even knew 'clean-path existed :-) If you try out the above, perhaps you would enlighten me as to why the slider on the side of the text-list doesn't update <g>

-- Graham Chiu
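[Editor's sketch] The three kinds of image reference in this message (same directory, root-relative, and parent-relative) are what standard relative-URL resolution handles. As an illustration only, here is how Python's `urllib.parse.urljoin` resolves them against the page URL; the helper name `absolute_image_urls` is hypothetical. Note that this needs the page URL as a base, which is more than a one-argument clean-path is given.

```python
from urllib.parse import urljoin


def absolute_image_urls(page_url: str, refs: list[str]) -> list[str]:
    """Resolve each reference found in a page against that page's own URL."""
    return [urljoin(page_url, ref) for ref in refs]


page = "http://www.asus.com/Products/Addon/Vga/Agpv3800/index.html"
refs = [
    "ball-yellow.gif",            # current directory
    "/Image/logo-title.gif",      # off the root directory
    "../../../Images/arrow.gif",  # up 3 directories
]

for url in absolute_image_urls(page, refs):
    print(url)
# http://www.asus.com/Products/Addon/Vga/Agpv3800/ball-yellow.gif
# http://www.asus.com/Image/logo-title.gif
# http://www.asus.com/Products/Images/arrow.gif
```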

[17/20] from: sterling:rebol at: 4-Dec-2000 20:12

Right. I see what you mean. Perhaps the more frequent use of clean-path on a URL is this. We'll talk about it the next time we meet about /Core fixes/enhancements/etc. The only issue may be that we would be breaking code that relied on the current behavior... how much of that there is I can't say.

The slider does not update on the text list unless you tell it to. I use the following function in my code to update text-list sliders every time I mod the list. The function refers to list/lc. lc is a word in a text-list face that is the number of visible lines of the list.

    ; updates the bar on the side of a text-list or group of text-lists
    fix-slider: func [faces [object! block!]] [
        foreach list to-block faces [
            either 0 = length? list/data [list/sld/redrag 1] [
                list/sld/redrag list/lc / length? list/data]
        ]
    ]

This way I can make changes to one or more text-lists in a layout, fix the sliders, and then re-show the needed faces.

Sterling

[18/20] from: gchiu:compkarori at: 5-Dec-2000 21:25

On Mon, 4 Dec 2000 20:12:11 -0800 [sterling--rebol--com] wrote:

> meet about /Core fixes/enhancements/etc. The only issue may be that
> we would be breaking code that relied on the current behavior... how
> much of that there is I can't say.

Probably not a lot since there isn't much point using it on urls at present :-)

> The slider does not update on the text list unless you tell it to. I
> use the following function in my code to update text-list sliders
> every time I mod the list.

thanks. Never would have worked that out myself. -- Graham Chiu

[19/20] from: allenk:powerup:au at: 5-Dec-2000 22:56

Trawling through my archives I found this one from Bo. Hope it's useful.

    REBOL [
        Title: "Clean HTTP path"
        Date: 16-Sep-1999
        Author: "Bohdan Lechnowsky"
        Email: [bo--rebol--com]
        File: %cleanhttp.r
        Purpose: {
            To remove /../ from within HTTP URLs. Will
            remove parent directories for each /../ encountered,
            but will not remove site information.
        }
    ]

    clean-parents: func [url][
        url: parse url "/"
        forall url [
            while [url/1 = ".."] [
                either (index? url) <= 4 [
                    remove url
                ][
                    remove/part back url 2
                    url: back url
                ]
            ]
        ]
        to-url join "http" to-url remove head url
    ]

Cheers,
Allen K
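[Editor's sketch] The idea stated in the script's Purpose (drop a parent directory for each /../, but never remove the site part) can be sketched in Python as follows. This is not a translation of Bo's REBOL code and has not been compared against it; the function below merely reuses the name `clean_parents` for illustration, and it ignores query strings and fragments.

```python
def clean_parents(url: str) -> str:
    """Remove '..' segments from an absolute URL without climbing above the host."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("expected an absolute URL such as http://host/path")
    host, *segments = rest.split("/")
    kept = []
    for seg in segments:
        if seg == "..":
            # Drop the parent directory if there is one; never touch the host.
            if kept:
                kept.pop()
        else:
            kept.append(seg)
    return scheme + "://" + "/".join([host] + kept)


print(clean_parents("http://www.asus.com/Products/Addon/Vga/Agpv3800/../../../Images/arrow.gif"))
# http://www.asus.com/Products/Images/arrow.gif
print(clean_parents("http://members.nbci.com/AndrewMartin/../"))
# http://members.nbci.com/
```

As discussed earlier in the thread, whether the cleaned URL names the same resource still depends on the server.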

[20/20] from: gchiu:compkarori at: 6-Dec-2000 7:54

On Tue, 5 Dec 2000 22:56:57 +1000 "Allen Kamp" <[allenk--powerup--com--au]> wrote:

> Trawling through my archives I found this one from Bo.
> Hope its useful..

<<quoted lines omitted: 7>>

> To remove /../ from within HTTP URLs. Will > remove parent

Cool. Perhaps Bo should submit this as a useful mezzanine function rather than overloading clean-path. -- Graham Chiu

Notes

Quoted lines have been omitted from some messages.