[REBOL] Re: problems with url...
From: joel:neely:fedex at: 18-Jun-2002 17:28
Hi, Romano,
No really useful suggestions, but...
Romano Paolo Tenca wrote:
> Hi, Cyphre
>
> >I have this problem, how to 'read following url from rebol?
> >http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8
>
> read to-url "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
> == {<!doctype html public "-//w3c//dtd html 3.2 final//en">...
>
> But load fails:
>
> read load "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
> ** User Error: URL error: http://slovnik.nettown.cz/?co=naslepo&kde=A-È
> ** Near: read load "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
>
> I do not know why. Any ideas?
>
I notice that LOAD seems to be decoding that percent-escaped character
at the end, and possibly turning it (prematurely?) into a character
that's not legal in a URL.
>> read to-url "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
== {<!doctype html public "-//w3c//dtd html 3.2 final//en">
<!--
Copyright (C) 2000 Petr Kùra, [kura--nettown--cz]
All rights reserved...
>> read load "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
** User Error: URL error: http://slovnik.nettown.cz/?co=naslepo&kde=A-È
** Near: read load "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
Notice that the message is
User Error: URL error: ...
and not
Syntax Error: Invalid url ...
as in
>> read load "http://%"
** Syntax Error: Invalid url -- http://%
** Near: (line 1) http://%
which suggests to me that LOAD accepted the string, but produced a
value that READ couldn't use.
When I say something like
>> gorp: "http://%77%77%77.rebol%2ecom/"
== "http://%77%77%77.rebol%2ecom/"
>> load gorp
== http://www.rebol.com/
LOAD seems to want to unescape the string. That's OK in this case,
since all of the escaped characters are actually valid in URLs, but
in the case of
>> bletch: "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
== "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"
>> load bletch
== http://slovnik.nettown.cz/?co=naslepo&kde=A-È
the (premature) unescaping of "%C8" back to the high-bit accented E
(È, character code 200) may be the source of grief, if that literal
character is then deemed invalid for use in a URL.
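To make that guess concrete, here's a rough sketch in Python rather than
REBOL, so it says nothing certain about what LOAD does internally. It
assumes the escape is decoded as a single Latin-1 byte. The names
escaped, unescaped and is_seven_bit are mine, and the 7-bit test is only
a crude stand-in for whatever check READ actually applies.

```python
from urllib.parse import quote, unquote

# The URL from this thread, still percent-escaped.
escaped = "http://slovnik.nettown.cz/?co=naslepo&kde=A-%C8"

# Assumption: each %XX escape is decoded as one Latin-1 byte.
unescaped = unquote(escaped, encoding="latin-1")

def is_seven_bit(text):
    """Crude stand-in for a URL validity check: plain 7-bit ASCII only."""
    return all(ord(ch) < 128 for ch in text)

# The last character is now the raw high-bit character, code 0xC8 (200).
print(hex(ord(unescaped[-1])))                            # 0xc8
print(is_seven_bit(escaped), is_seven_bit(unescaped))     # True False

# Re-escaping (leaving the URL delimiters alone) restores the usable form.
reescaped = quote(unescaped, safe=":/?&=", encoding="latin-1")
print(reescaped == escaped)                               # True
```

If something like this is what's going on, keeping the string escaped
until it reaches READ (which is what to-url appears to do in Romano's
example) would sidestep the problem.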
Hope this helps!!!
-jn-