
[REBOL] Re: URL handling

From: holger::rebol::com at: 21-Sep-2001 14:21

On Fri, Sep 21, 2001 at 09:05:33PM +0200, Hallvard Ystad wrote:
> The specs are about to be changed. I know it's still in some kind of beta state, but international characters are about to be allowed in URLs. As an example (I take it from your name that you're Danish, Holger),
German, actually.
> Yes, but there's one thing to keep in mind. The following does NOT work:
>
>     print read to-url to-string
Of course not. It is equivalent to print read http:///.. The to-url and to-string calls only change the type, not the contents of the URL.
> because rebol identifies the url as an url and interprets it the wrong way
REBOL uses % for escaping special characters in URLs. The URL does not behave the way you want for the same reason that the string abc^/def does not contain the characters ^ and /. In that case you need to escape the ^ by entering "abc^^/def" to get the expected result. The same is true for URLs, only it is the % that has to be escaped, leading to %25F8 instead of %F8.
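The string half of that comparison can be tried directly. The following is only a minimal sketch, assuming a REBOL 2 console; it covers the caret escape in string literals and deliberately leaves out the URL case, since the exact URL from the original question is not shown here.

```rebol
REBOL []

; The scanner converts ^/ into a single newline character,
; so this literal holds 7 characters: a b c newline d e f.
print "abc^/def"
print length? "abc^/def"

; Doubling the caret escapes it. The literal now holds 8 characters,
; including a real ^ followed by a real /.
print "abc^^/def"
print length? "abc^^/def"
```

The first string prints on two lines, the second on one line as abc^/def. By the same reasoning, a literal % in a URL that passes through the scanner would be written as %25.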
> before my to-string is evaluated. So if one receives a url through a referencing word, say 'my-word, then one has to get the string with something like
>
>     my-string: rejoin [{"} my-word {"}]
>
> before converting it to a URL.
What you are saying is a little confusing... Is 'my-word of type url! ? In that case you don't have to convert anything. It contains what you want. Is it of type string! ? In that case you don't need the quotes, just use to-url on the string.

You only run into problems if you run a URL that does not contain the required % escaping through 'load, 'do or any other function that uses the scanner, e.g. to-string when the argument is a block. You will encounter the same problem if you execute, say, to-string ["ab^/de"], and really want the ^ and / characters in the string.

The point to remember is that any time you run a sequence of characters through the scanner, REBOL will handle escape characters. This means if you know that the input does not contain the escaping required by REBOL, but literal, unescaped characters, then only use functions that do not use the scanner -- or insert the escaping yourself before calling the scanner.

If you need to convert a URL which is embedded into a larger string and does not contain proper escaping to a url! type then do not use 'load. Just pass the substring you need to to-url. That way the URL is not scanned and thus not changed.

This is not a bug. All scanners that allow escaping behave that way. Only use a scanner if the input complies with the escaping conventions used by the scanner.

--
Holger Kruse [holger--rebol--com]