World: r3wp
[!REBOL2 Releases] Discuss 2.x releases
BrianH 2-Sep-2010 [2012] | Well, DECODE-URL should probably not dehex the username and password unless the (unknown to me) code that reassembles the url! can be changed to rehex them. As it should, but I don't know which code to fix. The scheme, host and path should not be dehexed in any case. |
Graham 2-Sep-2010 [2013] | Probably don't dehex anything in decode-url ...leave it to the schemes |
Maxim 2-Sep-2010 [2014] | true... since a scheme can potentially not even represent anything network related. |
BrianH 2-Sep-2010 [2015] | The problem is that while the scheme might not represent anything network-related, the standard for URI syntax is independent of network issues. And that standard is pretty strict about hex encoding, regardless of the scheme's internal rules. So schemes need to be hex-encoding-aware for their specs, whether they are network-related or not. |
Graham 2-Sep-2010 [2016] | parse-url should not dehex .. we can fix the rest in the schemes |
Maxim 2-Sep-2010 [2017x2] | brian, true. my error... I'm deep in calculus... my brain is a bit mushy ;-) IIRC the RFC has a BNF-style breakdown, so there should be no surprise as to where hexing can and should be interpreted. |
I did a fully compliant RFC URL parser which works better than the internal one... maybe I should look at it for more details... | |
BrianH 2-Sep-2010 [2019] | My only concern is that I don't know where the code is that reassembles the url! from the results of DECODE-URL. So I don't know how to fix any issues in it. |
Graham 2-Sep-2010 [2020] | the code is in the sdk |
BrianH 2-Sep-2010 [2021] | Which file? What is the function named? |
Graham 2-Sep-2010 [2022] | Why do you think it is reassembled? |
Maxim 2-Sep-2010 [2023] | IIRC it's all in a context. |
Graham 2-Sep-2010 [2024] | All urls are deconstructed into an object |
BrianH 2-Sep-2010 [2025x3] | Right. But the problem I am trying to fix is this: >> http://user%40rebol.com:blah@www.rebol.com/ == http://user@rebol.com:blah@www.rebol.com/ |
Somewhere in the middle of that process, DECODE-URL is called. What is called after that to reassemble the result into a url! value? | |
The dehex in that process is the one that we need to get rid of. | |
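The ordering problem Brian describes can be sketched outside REBOL. The following is a minimal illustration in Python (not REBOL, and not the code path being hunted for here), using the standard `urllib.parse` module. The helper names `split_then_decode` and `decode_then_split` and the second URL with an encoded colon are assumptions made up for the example.

```python
from urllib.parse import urlsplit, unquote

def split_then_decode(url):
    """Split the URL into components first, then percent-decode each one."""
    parts = urlsplit(url)
    return unquote(parts.username), unquote(parts.password), parts.hostname

def decode_then_split(url):
    """Percent-decode the whole URL first, then split it (the unsafe order)."""
    parts = urlsplit(unquote(url))
    return parts.username, parts.password, parts.hostname

# The URL under discussion: the user name contains an encoded "@".
url = "http://user%40rebol.com:blah@www.rebol.com/"
safe = split_then_decode(url)

# Hypothetical variant: an encoded ":" inside the user name.
tricky = "http://us%3Aer:secret@www.rebol.com/"
good = split_then_decode(tricky)   # user name keeps its colon
bad = decode_then_split(tricky)    # the colon is mistaken for the separator
```

In the variant, decoding before splitting moves the user/password boundary, which is the kind of damage that early dehexing can cause once the delimiters are no longer distinguishable from data.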
Graham 2-Sep-2010 [2028] | I suspect we don't have the source to that... |
Maxim 2-Sep-2010 [2029] | look in the URL-parser context within the prot-utils.r file. that is where the url decoding occurs. |
Graham 2-Sep-2010 [2030] | There's no 'dehex there |
Maxim 2-Sep-2010 [2031] | but I remember having the same issue a while back and traced it to the actual datatype always handling the hex values. |
Graham 2-Sep-2010 [2032x2] | I don't think it matters |
yes, Max .. | |
Maxim 2-Sep-2010 [2034] | just using the to-url created the same headaches... IIRC |
Graham 2-Sep-2010 [2035] | Brian's issue is not a problem |
BrianH 2-Sep-2010 [2036] | That is exactly where it matters. That is the whole problem. |
Graham 2-Sep-2010 [2037] | We can fix it without worrying about that part |
BrianH 2-Sep-2010 [2038] | That part is the only part that needs fixing. |
Maxim 2-Sep-2010 [2039x2] | brian .. I agree.. the hexing should stay in the url datatype until the actual network scheme requires to handle it. % characters are valid url so they should not get "fixed" |
%XX that is. | |
Graham 2-Sep-2010 [2041] | But you can't fix it because it's the way Rebol evaluates datatypes .. only Carl can fix that. |
Maxim 2-Sep-2010 [2042] | exactly. |
BrianH 2-Sep-2010 [2043] | Since when is that a constraint? |
Graham 2-Sep-2010 [2044] | 12 years I think now |
BrianH 2-Sep-2010 [2045] | Problems that only Carl can fix still need fixing. |
Maxim 2-Sep-2010 [2046x3] | so it's a limitation in the URL datatype... akin to aggressive error evaluation. |
so in REBOL speak, url dehexing should be "relaxed" :-) | |
in my app, I ended up doing all URL manipulation in strings, and then just converting to url at the time of network call | |
BrianH 2-Sep-2010 [2049] | Or at least put off until it is appropriate to do. |
Maxim 2-Sep-2010 [2050x2] | IMHO the datatype can't know when. only the schemes and url processors know "when" is appropriate. |
the problem is when we are programmatically managing URIs. the datatype dehexing really gets in the way. | |
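Maxim's workaround (keep the URL as a string, encode on assembly, convert only at the network call) might look roughly like this. It is a sketch in Python rather than REBOL; `build_url` is a hypothetical helper, and encoding only the user and password fields is an assumption taken from the earlier remarks about what needs rehexing.

```python
from urllib.parse import quote, urlsplit, unquote

def build_url(scheme, user, password, host, path="/"):
    """Assemble a URL string, percent-encoding only the user and password."""
    # safe="" forces reserved characters such as "@" and ":" to be encoded.
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return f"{scheme}://{userinfo}@{host}{path}"

url = build_url("http", "user@rebol.com", "blah", "www.rebol.com")

# Round trip: splitting and then decoding gives back the original components.
parts = urlsplit(url)
round_trip = (unquote(parts.username), unquote(parts.password), parts.hostname)
```

Because the encoded form is only ever decoded per component, the string survives any number of split and rebuild cycles unchanged.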
Graham 2-Sep-2010 [2052] | >> http://user%40rebol.com:blah@www.rebol.com/ == http://user@rebol.com:blah@www.rebol.com/ >> a: to-url "http://user%40rebol.com:blah@www.rebol.com" == http://user%40rebol.com:blah@www.rebol.com >> a == http://user%40rebol.com:blah@www.rebol.com |
BrianH 2-Sep-2010 [2053x2] | It's really simple: The url! datatype should do no dehexing itself. The file! datatype can dehex, but not url!. Dehexing is only safe after decoding. |
Graham, thanks for narrowing it down. | |
Maxim 2-Sep-2010 [2055x2] | just tried a read, and with the second form of Graham's test (using to-url on a string) the url parser doesn't dehex... so the username will be invalid. |
but I guess the server is responsible for dehexing in that case. | |
BrianH 2-Sep-2010 [2057x2] | Um, no. The HTTP standard for basic authentication doesn't hex-encode the user or password fields. The browser (or in our case, http scheme) does. |
Only the path is hex-encoded when passed to the server. | |
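Brian's point about who decodes what can be illustrated with a small client-side sketch. This is Python, not the REBOL http scheme; `basic_auth_header` and `request_target` are hypothetical helpers, and the path `/my%20file.html` is invented for the example. It assumes HTTP Basic authentication, where the client sends the base64 of the decoded `user:password` pair.

```python
import base64
from urllib.parse import urlsplit, unquote

def basic_auth_header(url):
    """Build a Basic auth header from the decoded user and password."""
    parts = urlsplit(url)
    user = unquote(parts.username or "")
    password = unquote(parts.password or "")
    # The credentials are base64-encoded, not percent-encoded, on the wire.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Authorization: Basic " + token

def request_target(url):
    """Return the path as sent to the server, still percent-encoded."""
    return urlsplit(url).path or "/"

url = "http://user%40rebol.com:blah@www.rebol.com/my%20file.html"
header = basic_auth_header(url)
target = request_target(url)
```

So the client is the one that dehexes the credentials, while the path travels to the server in its encoded form.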
Maxim 2-Sep-2010 [2059x2] | ok so then the dehexing should be added in the url-parser and string notation used for @ containing passwords. just like we use string notation for files containing spaces. |
this could be a workaround until Carl stops dehexing in the loading phase. | |
BrianH 2-Sep-2010 [2061] | Still haven't traced that. |