World: r3wp
[!REBOL2 Releases] Discuss 2.x releases
Graham 2-Sep-2010 [2016] | parse-url should not dehex .. we can fix the rest in the schemes |
Maxim 2-Sep-2010 [2017x2] | brian, true. my error... I'm deep in calculus... my brain is a bit mushy ;-) IIRC the RFC has a BNF-style breakdown, so there should be no surprise as to where hexing can and should be interpreted. |
I did a fully compliant RFC URL parser which works better than the internal one... maybe I should look at it for more details... | |
BrianH 2-Sep-2010 [2019] | My only concern is that I don't know where is the code that reassembles the url! from the results of DECODE-URL. So I don't know how to fix any issues in it. |
Graham 2-Sep-2010 [2020] | the code is in the sdk |
BrianH 2-Sep-2010 [2021] | Which file? What is the function named? |
Graham 2-Sep-2010 [2022] | Why do you think it is reassembled? |
Maxim 2-Sep-2010 [2023] | IIRC it's all in a context. |
Graham 2-Sep-2010 [2024] | All urls are deconstructed into an object |
BrianH 2-Sep-2010 [2025x3] | Right. But the problem I am trying to fix is this: >> http://user%40rebol.com:blah@www.rebol.com/ == http://user@rebol.com:blah@www.rebol.com/ |
Somewhere in the middle of that process, DECODE-URL is called. What is called after that to reassemble the result into a url! value? | |
The dehex in that process is the one that we need to get rid of. | |
Graham 2-Sep-2010 [2028] | I suspect we don't have the source to that... |
Maxim 2-Sep-2010 [2029] | look in the URL-parser context within the prot-utils.r file. that is where the url decoding occurs. |
Graham 2-Sep-2010 [2030] | There's no 'dehex there |
Maxim 2-Sep-2010 [2031] | but I remember having the same issue a while back and traced it to the actual datatype always handling the hex values. |
Graham 2-Sep-2010 [2032x2] | I don't think it matters |
yes, Max .. | |
Maxim 2-Sep-2010 [2034] | just using the to-url created the same headaches... IIRC |
Graham 2-Sep-2010 [2035] | Brian's issue is not a problem |
BrianH 2-Sep-2010 [2036] | That is exactly where it matters. That is the whole problem. |
Graham 2-Sep-2010 [2037] | We can fix it without worrying about that part |
BrianH 2-Sep-2010 [2038] | That part is the only part that needs fixing. |
Maxim 2-Sep-2010 [2039x2] | brian .. I agree.. the hexing should stay in the url datatype until the actual network scheme needs to handle it. % characters are valid in a url, so they should not get "fixed" |
%XX that is. | |
Graham 2-Sep-2010 [2041] | But you can't fix it because it's the way Rebol evaluates datatypes .. only Carl can fix that. |
Maxim 2-Sep-2010 [2042] | exactly. |
BrianH 2-Sep-2010 [2043] | Since when is that a constraint? |
Graham 2-Sep-2010 [2044] | 12 years I think now |
BrianH 2-Sep-2010 [2045] | Problems that only Carl can fix still need fixing. |
Maxim 2-Sep-2010 [2046x3] | so it's a limitation in the URL datatype... akin to aggressive error evaluation. |
so in REBOL speak, url dehexing should be "relaxed" :-) | |
in my app, I ended up doing all URL manipulation in strings, and then just converting to url at the time of network call | |
BrianH 2-Sep-2010 [2049] | Or at least put off until it is appropriate to do. |
Maxim 2-Sep-2010 [2050x2] | IMHO the datatype can't know when. only the schemes and url processors know "when" is appropriate. |
the problem is when we are programmatically managing URIs. the datatype dehexing really gets in the way. | |
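Editorial illustration of the workaround Maxim describes above (keep the URL as a plain string and only hand it to the network layer at the last moment). This is a minimal sketch in Python rather than REBOL; `build_url` is a hypothetical helper, not part of any REBOL or SDK API, and it assumes the only fields that need percent-encoding are the user, the password and the path.

```python
from urllib.parse import quote

def build_url(scheme, user, password, host, path="/"):
    """Hypothetical helper: assemble a URL string from already-decoded parts.

    Each userinfo field is percent-encoded on its own (safe="" also encodes
    "@", ":" and "/"), so the delimiters added here stay unambiguous.
    """
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return f"{scheme}://{userinfo}@{host}{quote(path)}"

# The string stays encoded for as long as the program manipulates it.
url = build_url("http", "user@rebol.com", "blah", "www.rebol.com")
print(url)
```

Because the result is an ordinary string, nothing dehexes it behind the program's back; conversion to a URL value can wait until the request is made.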
Graham 2-Sep-2010 [2052] | >> http://user%40rebol.com:blah@www.rebol.com/ == http://user@rebol.com:blah@www.rebol.com/ >> a: to-url "http://user%40rebol.com:blah@www.rebol.com" == http://user%40rebol.com:blah@www.rebol.com >> a == http://user%40rebol.com:blah@www.rebol.com |
BrianH 2-Sep-2010 [2053x2] | It's really simple: The url! datatype should do no dehexing itself. The file! datatype can dehex, but not url!. Dehexing is only safe after decoding. |
Graham, thanks for narrowing it down. | |
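Editorial illustration of BrianH's rule that dehexing is only safe after decoding (that is, after the URL has been split into its fields). This is a sketch in Python, not REBOL; `split_url` is a hypothetical minimal splitter written for this example, and it assumes a URL of the form scheme://user:pass@host/path.

```python
from urllib.parse import unquote

def split_url(url):
    """Hypothetical minimal splitter for scheme://user:pass@host/path.

    It decodes nothing and treats the first "@" as the end of userinfo,
    which is valid only while "@" inside userinfo is still percent-encoded.
    """
    scheme, rest = url.split("://", 1)
    authority, _, path = rest.partition("/")
    userinfo, _, host = authority.partition("@")
    user, _, password = userinfo.partition(":")
    return scheme, user, password, host, "/" + path

raw = "http://user%40rebol.com:blah@www.rebol.com/"

# Split first, then decode each field on its own.
_, user, password, host, _ = split_url(raw)
late = (unquote(user), unquote(password), host)

# Decode first, then split: the decoded "@" is mistaken for the delimiter.
_, early_user, early_password, early_host, _ = split_url(unquote(raw))
early = (early_user, early_password, early_host)

print(late)
print(early)
```

A splitter that looks for the last "@" instead would happen to survive this particular input, but the same failure reappears with other encoded delimiters such as %3A or %2F, which is why the order of operations matters more than the choice of splitter.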
Maxim 2-Sep-2010 [2055x2] | just tried a read, and with the second form of graham's test (using to-url on a string) the url parser doesn't dehex... so the username will be invalid. |
but I guess the server is responsible for dehexing in that case. | |
BrianH 2-Sep-2010 [2057x2] | Um, no. The HTTP standard for basic authentication doesn't hex-encode the user or password fields. The browser (or in our case, http scheme) does. |
Only the path is hex-encoded when passed to the server. | |
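Editorial illustration of the division of labour BrianH describes: the client dehexes the user and password taken from the URL, then sends them in the Basic `Authorization` header as base64 of `user:password`, while the path goes out still percent-encoded. This is a sketch in Python, not the REBOL http scheme; `basic_auth_header` is a hypothetical name and UTF-8 is an assumed character encoding.

```python
import base64
from urllib.parse import unquote

def basic_auth_header(user_enc, pass_enc):
    """Hypothetical helper: build a Basic Authorization header value.

    The inputs are the still-percent-encoded fields cut out of the URL.
    They are decoded here, on the client, and never sent hex-encoded.
    """
    credentials = unquote(user_enc) + ":" + unquote(pass_enc)
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic " + token

header = basic_auth_header("user%40rebol.com", "blah")
# What the server recovers after undoing the base64 step:
received = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(received)
```

The server only has to reverse the base64 step; it never sees `%40`, so it has no dehexing to do for the credentials.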
Maxim 2-Sep-2010 [2059x2] | ok so then the dehexing should be added in the url-parser and string notation used for @ containing passwords. just like we use string notation for files containing spaces. |
this could be a workaround until Carl stops dehexing in the loading phase. | |
BrianH 2-Sep-2010 [2061] | Still haven't traced that. |
Maxim 2-Sep-2010 [2062] | >> c: load "http://user%40rebol.com:blah@www.rebol.com" == http://user@rebol.com:blah@www.rebol.com |
BrianH 2-Sep-2010 [2063] | No, I mean I haven't traced it. LOAD calls other functions, maybe even in R2. |
Chris 2-Sep-2010 [2064] | The 'solution' is: read [scheme: 'http user: "user@rebol.com" pass: .........] : ) |
Maxim 2-Sep-2010 [2065] | but that's not a uri ;-) the point of the url datatype is that we shouldn't need to use "specifications" but uri paths. |