[REBOL] Re: problems with url...(and another "escaped" problem)
From: rotenca::telvia::it at: 19-Jun-2002 23:54
Hi Scott, Ingo and all,
I have rethought the whole question, and now I have a clearer picture (I
hope).
> The translation is "immediate." This suggests a bug at the interpreter
> level for url! (and same for file!, but the error is even less likely to
> surface due to file naming conventions).
The translation is done by load.
> Other escape characters escape this immediate translation. Witness (with
> artificial url):
>
> >> http://a/
> == http://a/
1) url! values behave like file! values.
2) The escape sequence is translated like any other. It does not really
exist in the loaded string:
x: h:/%20 ; == h:/%20
type? x ; == url!
form x ; == "h:/ "
last x ; == #" "
length? x ; == 4
The loaded url does not contain the char sequence "%20" but a real space.
3) Load converts ALL the %xx values it finds in the string.
4) It is probe/mold that shows the %xx form, because it knows that x is a
url! (or a file!).
probe x ; == h:/%20
mold x ; == "h:/%20"
length? mold x ; == 6
5) But molding a file or url converts back to the %xx form only the chars
with these decimal codes:
[
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31
32 34 40 41 59 91 93
123 125 127 160
]
A URL, indeed, requires in some sections/schemes that all non-alphanumeric
chars (anything outside a-z, A-Z and a few more) are escaped. Yet when you
mold a url containing a "È" char, mold does not show %c8 but "È".
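One way the list in point 5 could be reproduced is a small loop like the one below. It is only a sketch, assuming REBOL 2 behaviour; the names escaped and u are made up for the example, and I give no expected output since it may differ between versions.

```rebol
REBOL []

; Collect the char codes that mold writes in %xx form.
; "escaped" and "u" are hypothetical names used only in this sketch.
escaped: copy []
repeat n 255 [
    ; to-url on a string keeps the char as-is (see point 6)
    u: to-url join "h:/" to-char n
    ; an escaped char takes three chars in the molded form instead of one
    if (length? mold u) > (length? u) [append escaped n]
]
probe escaped
```

Comparing lengths instead of searching for "%" avoids a false hit on the "%" char itself.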
6) In the expression:
length? to-url "h:/%20" ; == 6
the sequence "%20" is inserted as-is in the string.
7) Escaping the % as %25 and loading the url has the same effect:
length? h:/%2520 ; == 6
>The answer lies within the http scheme. At one point the "target"
>is recreated out of components:
>target: next mold to-file join (join "/" either found? port/path
>[port/path] [""]) either found? port/target [port/target] [""]
> however, the to-file action retranslates it back to
> an escape!
It is the [ mold to-file ] expression that re-creates the "%xx" string, but
this works only for the chars already listed.
Other chars like %c8 do not "re-appear".
> Later, the scheme wishes to check for the need for using a proxy:
>
> http-packet: reform [http-command either generic-proxy? [port/url] [target]
> http-version]
>
> Here, the generic-proxy version utilizes the "spaced" version; whereas, the
> non-proxy version uses the newly created escaped version (that is a path and
> target only). A hacked fix would be sure that the generic-proxy gets a
> similarly re-escaped version, which could be done several ways, probably
> most cleverly using Ingo's idea.
The problem is that we have no mold support for chars like %c8, so target
is also wrong for these chars. A solution could be to parse the string and
escape the missed chars, or to escape all the chars (should be safe).
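A minimal sketch of the "escape the missed chars" idea, assuming REBOL 2 (byte strings, to-hex returning an 8-digit issue!). The name escape-target and the set of chars left untouched are my assumptions for illustration; they are not taken from the http scheme source.

```rebol
REBOL []

; Chars copied through unchanged. This set is an assumption, chosen so
; that "/" still separates the path segments of the target.
safe-chars: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "-._~/"]

; escape-target is a hypothetical helper, not part of REBOL.
escape-target: func [
    "Return a copy of STR with every char outside safe-chars written as %xx"
    str [string!]
    /local out ch hex
][
    out: copy ""
    foreach ch str [
        either find safe-chars ch [
            append out ch
        ][
            ; to-hex gives an issue! such as #000000C8; keep the last two digits
            hex: form to-hex to-integer ch
            append out join "%" skip tail hex -2
        ]
    ]
    out
]

print escape-target "/a b"   ; "/a%20b" under the assumptions above
```

Something like this could be applied to the target in place of [ mold to-file ], so that chars such as %c8 are escaped too.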
Another workaround is to always store the url in a string and call to-url
just before open/read/write, but this limits the use of the url! datatype.
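As a rough illustration of this workaround (the address is a placeholder and url-text is a made-up name):

```rebol
REBOL []

; Keep the escaped form in a string!, so load never translates the %20.
url-text: "http://www.example.com/a%20b.html"

; Convert only at the moment of use; to-url keeps "%20" as-is (see point 6).
page: read to-url url-text
```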
The final solution should be a url! datatype (and mold) that knows how to
handle the escaped chars in the different sections of the different schemes,
but I think it is very difficult to write.
> Hope this clarifies the additional mystery.
It helped me. Thanks!
---
Ciao
Romano