World: r3wp

Join the discussions in the REBOL3 world...

[!REBOL2 Releases] Discuss 2.x releases

Graham 2-Sep-2010 [2016]	parse-url should not dehex .. we can fix the rest in the schemes
Maxim 2-Sep-2010 [2017x2]	brian, true. my error... I'm deep in calculus... my brain is a bit mushy ;-) IIRC the RFC has a BNF-style breakdown, so there should be no surprise as to where hexing can and should be interpreted.
Maxim 2-Sep-2010 [2017x2]	I did a fully compliant RFC URL parser which works better than the internal one... maybe I should look at it for more details...
BrianH 2-Sep-2010 [2019]	My only concern is that I don't know where is the code that reassembles the url! from the results of DECODE-URL. So I don't know how to fix any issues in it.
Graham 2-Sep-2010 [2020]	the code is in the sdk
BrianH 2-Sep-2010 [2021]	Which file? What is the function named?
Graham 2-Sep-2010 [2022]	Why do you think it is reassembled?
Maxim 2-Sep-2010 [2023]	IIRC it's all in a context.
Graham 2-Sep-2010 [2024]	All urls are deconstructed into an object
BrianH 2-Sep-2010 [2025x3]	Right. But the problem I am trying to fix is this: >> http://user%40rebol.com:blah@www.rebol.com/ == http://user@rebol.com:blah@www.rebol.com/
	Somewhere in the middle of that process, DECODE-URL is called. What is called after that to reassemble the result into a url! value?
	The dehex in that process is the one that we need to get rid of.
Graham 2-Sep-2010 [2028]	I suspect we don't have the source to that...
Maxim 2-Sep-2010 [2029]	look in the URL-parser context within the prot-utils.r file. that is where the url decoding occurs.
Graham 2-Sep-2010 [2030]	There's no 'dehex there
Maxim 2-Sep-2010 [2031]	but I remember having the same issue a while back and traced it to the actual datatype always handling the hex values.
Graham 2-Sep-2010 [2032x2]	I don't think it matters
Graham 2-Sep-2010 [2032x2]	yes, Max ..
Maxim 2-Sep-2010 [2034]	just using the to-url created the same headaches... IIRC
Graham 2-Sep-2010 [2035]	Brian's issue is not a problem
BrianH 2-Sep-2010 [2036]	That is exactly where it matters. That is the whole problem.
Graham 2-Sep-2010 [2037]	We can fix it without worrying about that part
BrianH 2-Sep-2010 [2038]	That part is the only part that needs fixing.
Maxim 2-Sep-2010 [2039x2]	brian .. I agree.. the hexing should stay in the url datatype until the actual network scheme needs to handle it. % characters are valid in a url, so they should not get "fixed"
Maxim 2-Sep-2010 [2039x2]	%XX that is.
Graham 2-Sep-2010 [2041]	But you can't fix it because it's the way Rebol evaluates datatypes .. only Carl can fix that.
Maxim 2-Sep-2010 [2042]	exactly.
BrianH 2-Sep-2010 [2043]	Since when is that a constraint?
Graham 2-Sep-2010 [2044]	12 years I think now
BrianH 2-Sep-2010 [2045]	Problems that only Carl can fix still need fixing.
Maxim 2-Sep-2010 [2046x3]	so it's a limitation in the URL datatype... akin to aggressive error evaluation.
	so in REBOL speak, url dehexing should be "relaxed" :-)
	in my app, I ended up doing all URL manipulation in strings, and then just converting to url at the time of network call
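A minimal sketch of the string-first approach Maxim describes, written in Python rather than REBOL so it can be run as-is. The helper name `with_path` and the sample path are assumptions made for illustration; the URL has the same shape as the one discussed in the thread. The point is only that the URL is split on its delimiters and reassembled without any percent-decoding along the way.

```python
from urllib.parse import urlsplit

def with_path(url: str, new_path: str) -> str:
    """Return url with its path replaced; all other components are copied verbatim."""
    parts = urlsplit(url)  # splits on delimiters only, performs no percent-decoding
    return parts._replace(path=new_path).geturl()

url = "http://user%40rebol.com:blah@www.rebol.com/"
result = with_path(url, "/docs/index.html")
print(result)  # the %40 in the user name is still encoded
```

Because nothing is decoded during manipulation, the result can be handed to the network layer at the last moment, which then decides what to decode.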
BrianH 2-Sep-2010 [2049]	Or at least put off until it is appropriate to do.
Maxim 2-Sep-2010 [2050x2]	IMHO the datatype can't know when. only the schemes and url processors know "when" is appropriate.
Maxim 2-Sep-2010 [2050x2]	the problem is when we are programmatically managing URIs. the datatype dehexing really gets in the way.
Graham 2-Sep-2010 [2052]	>> http://user%40rebol.com:blah@www.rebol.com/ == http://user@rebol.com:blah@www.rebol.com/ >> a: to-url "http://user%40rebol.com:blah@www.rebol.com" == http://user%40rebol.com:blah@www.rebol.com >> a == http://user%40rebol.com:blah@www.rebol.com
BrianH 2-Sep-2010 [2053x2]	It's really simple: The url! datatype should do no dehexing itself. The file! datatype can dehex, but not url!. Dehexing is only safe after decoding.
BrianH 2-Sep-2010 [2053x2]	Graham, thanks for narrowing it down.
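A small illustration of why dehexing is only safe after decoding, sketched in Python with the standard `urllib.parse` module rather than REBOL. The two function names and the sample credentials (`us%3Aer`, `secret`) are hypothetical. Splitting first and percent-decoding each component afterwards keeps an encoded delimiter inside the user name; decoding the whole URL first turns it into a real delimiter and shifts the component boundaries.

```python
from urllib.parse import urlsplit, unquote

def split_then_decode(url):
    """Split the URL into components first, then percent-decode each one."""
    parts = urlsplit(url)
    user = unquote(parts.username) if parts.username is not None else None
    password = unquote(parts.password) if parts.password is not None else None
    return user, password, parts.hostname

def decode_then_split(url):
    """Percent-decode the whole URL first, then split it (the unsafe order)."""
    parts = urlsplit(unquote(url))
    return parts.username, parts.password, parts.hostname

# %3A is an encoded ":" that belongs to the user name
url = "http://us%3Aer:secret@www.rebol.com/"

safe = split_then_decode(url)    # user name keeps its ":" intact
unsafe = decode_then_split(url)  # the decoded ":" is mistaken for the user/password separator
print(safe)
print(unsafe)
```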
Maxim 2-Sep-2010 [2055x2]	just tried a read, and with the second form of graham's test (using to-url on a string) the url parser doesn't dehex... so the username will be invalid.
Maxim 2-Sep-2010 [2055x2]	but I guess the server is responsible for dehexing in that case.
BrianH 2-Sep-2010 [2057x2]	Um, no. The HTTP standard for basic authentication doesn't hex-encode the user or password fields. The browser (or in our case, http scheme) does.
BrianH 2-Sep-2010 [2057x2]	Only the path is hex-encoded when passed to the server.
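A rough sketch of what BrianH describes: the client side decodes the credentials taken from the URL and then builds the Basic authentication header from the decoded values, so the server never sees `%40`. This is Python, not the REBOL http scheme; the function name `basic_auth_header` is an assumption for illustration, and the header format (base64 of `user:password`) follows the HTTP Basic scheme.

```python
import base64
from urllib.parse import unquote

def basic_auth_header(raw_user: str, raw_password: str) -> str:
    """Build an Authorization header value from percent-encoded URL credentials."""
    user = unquote(raw_user)          # "user%40rebol.com" becomes "user@rebol.com"
    password = unquote(raw_password)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token

header = basic_auth_header("user%40rebol.com", "blah")
print("Authorization:", header)
```

Decoding the token on the receiving end yields `user@rebol.com:blah`, with no hex-encoding left in the credentials.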
Maxim 2-Sep-2010 [2059x2]	ok so then the dehexing should be added in the url-parser and string notation used for passwords containing @. just like we use string notation for files containing spaces.
Maxim 2-Sep-2010 [2059x2]	this could be a workaround until Carl stops dehexing in the loading phase.
BrianH 2-Sep-2010 [2061]	Still haven't traced that.
Maxim 2-Sep-2010 [2062]	>> c: load "http://user%40rebol.com:blah@www.rebol.com" == http://user@rebol.com:blah@www.rebol.com
BrianH 2-Sep-2010 [2063]	No, I mean I haven't traced it. LOAD calls other functions, maybe even in R2.
Chris 2-Sep-2010 [2064]	The 'solution' is: read [scheme: 'http user: "user@rebol.com" pass: .........] : )
Maxim 2-Sep-2010 [2065]	but that's not a uri ;-) the point of the url datatype is that we shouldn't need to use "specifications" but uri paths.