[load vs load/all] [how to//handle untrusted data] load v load/all - CGI security & word consumption
[1/5] from: pwawood::mango::net::my at: 17-Nov-2004 14:12
Whilst tagging the mailing list archive, I came across this important
question to which there seems to be no response.
The basic premises are that it's more secure to use load/all when
reading "untrusted" data and that using load/all eats up available
words.
Is this true?
Is there a better way of handling "untrusted" data?
Regards
Peter
The original message :
Hi there,
Jeff (Rebol Technologies)(I think) in Zine/4 wrote:
====
In fact, LOAD/all is the safest LOAD and you should use it when ever
LOADing a string or file from an untrusted source (like CGI, for
instance).
<snip>
LOAD/all will always give you a block where as LOAD will give you a
single item if there is only one item. LOAD/all always produces a block
as a convenience because it is the "paranoid" LOAD. Whatever you give
LOAD/all, it always gives you an unevaluated block of that thing. So if
you do:
error? try [load/all some-random-string]
you can't go wrong. LOAD/all you can.
====
That's good advice, and it showed me that I had a security flaw in my
code... Just doing a Load on a CGI field is a route to an immediate
shutdown if the field contains "Rebol [Quit]".
But it seems to be a ticking timebomb .... Each Load/All uses up (at
least) one entry in System/words -- e.g.
loop 2000 [
    load/all join "A" [Random 50000]
    print length? first system/words
]
When First System/words hits 4095 (or thereabouts: I believe the number
differs across systems), my 24x7 application goes down like a Microsoft
server.
I'm using Load/All to convert an untrusted string into a date or decimal
or string. Does anyone have a workaround for its unwanted behavior? Or
am I writing the function 'ConvertUntrusted ?
--Thanks,
--Colin.
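A minimal sketch of what such a function might look like, for readers of the archive. It assumes that only a decimal, a date or a plain string is wanted, and it avoids LOAD altogether by using TO conversions wrapped in TRY. The name convert-untrusted and the order of the attempts are assumptions made for illustration and do not come from the original post. Whether the TO conversions leave system/words untouched has not been checked here, so measure it on your own version.

```rebol
convert-untrusted: func [
    "Hypothetical helper: returns a decimal!, a date! or the original string"
    text [string!]
    /local value
][
    ; Try each conversion in turn; a failed TO raises an error that TRY traps.
    if not error? try [value: to decimal! text] [return value]
    if not error? try [value: to date! text] [return value]
    ; Nothing matched, so hand back the string itself, never evaluated.
    text
]
```

The string is never LOADed or DOne, so input such as "Rebol [Quit]" simply falls through to the last line and comes back as a string.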
[2/5] from: gabriele::colellachiara::com at: 17-Nov-2004 11:59
Hi Peter,
On Wednesday, November 17, 2004, 7:12:42 AM, you wrote:
PWW> The basic premises are that it's more secure to use load/all when
PWW> reading "untrusted" data and that using load/all eats up available
PWW> words.
The first assumption is only true in older versions of REBOL; the
second is false, not in the sense that new words do not use space,
but in the sense that this happens with a normal LOAD too. Also,
in newer versions of REBOL the word limit has been increased, so
unless you are doing something like the example you provided and
intentionally creating a lot of different words, this is not a
problem.
Note that if you use TO BLOCK! instead of LOAD, you do not use space
in the global context, because it does not bind the words. This is
probably safer than LOAD for untrusted data: since the words are
not bound, you don't risk anything even if you accidentally
evaluate them.
>> to block! "quit"
== [quit]
>> do to block! "quit"
** Script Error: quit word has no context
** Near: quit
Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amiga Group Italia sez. L'Aquila --- SOON: http://www.rebol.it/
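As a hedged illustration of the TO BLOCK! approach described in this message: once the string is an unbound block, each value can be checked by datatype and only the expected kinds kept, with nothing evaluated. The sample string and the word accepted are made up for the example.

```rebol
accepted: copy []
; TO BLOCK! scans the text into values but leaves any words unbound.
foreach item to block! "quit 17-Nov-2004 3.14" [
    ; Keep only the datatypes the application expects; skip everything else.
    if any [date? item decimal? item] [append accepted item]
]
probe accepted
```

The word quit is seen only as a word! value inside the loop and is dropped by the datatype check. A malformed string can still make TO BLOCK! itself fail, so real code would also trap that error.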
[3/5] from: SunandaDH::aol::com at: 17-Nov-2004 8:49
Peter:
> The basic premises are that it's more secure to use load/all when
> reading "untrusted" data and that using load/all eats up available
> words.
Gabriele:
> The first assumption is only true in older versions of REBOL;
Not quite......The first assumption is true in the current production release
of REBOL/View:
-- download from this page:
http://www.rebol.com/view-platforms.html
-- and then try:
load "rebol [quit]"
There is a note on the web page advising people to download newer betas.
But the existence of a newer beta does not make the existing production
release an "older version" -- it remains the current, official version until
actually replaced by RT.
RT seem to be very slow in getting around to that. But there must be a reason
for it. So any company thinking of using REBOL is probably better off using the
official releases.
So, for code using the official version, 'load/all is the safe alternative to
'load.
As you say, both 'load/all and 'load use up part of the finite space in
system/words, so 'to-block is better in many cases.
In all cases, and all versions of REBOL, wrap the untrusted 'load etc. in an
error? try block because they will fail on some strings, e.g.:
load "]"
load/all "]"
to-block "]"
There still remains the problem (in all versions of REBOL) that 'load or 'do
of code will use up part of the finite space in system/words, and this space
is unrecoverable as far as I know.
That places a limit on the size of REBOL applications, or the length of time
they can run if they evaluate console input.
Sunanda.
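A small sketch of the error-trapping advice above, using the usual REBOL 2 idiom of capturing the TRY result and DISARMing it so the failure can be reported. The function name parse-untrusted and the message text are hypothetical.

```rebol
parse-untrusted: func [
    "Hypothetical helper: returns a block of unbound values, or none on bad input"
    text [string!]
    /local result
][
    either error? result: try [to block! text] [
        ; DISARM turns the error into an object whose fields can be read safely.
        result: disarm result
        print ["Rejected input:" result/type result/id]
        none
    ][
        result
    ]
]
```

A caller can then test the return value, for example: if none? data: parse-untrusted "]" [print "bad field"]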
[4/5] from: gabriele::colellachiara::com at: 17-Nov-2004 18:55
Hi SunandaDH,
On Wednesday, November 17, 2004, 2:49:12 PM, you wrote:
Sac> Not quite......The first assumption is true in the current production release
Sac> of REBOL/View:
Release version is an older version of REBOL for me. (Actually, an
obsolete version of REBOL for me.) Don't blame me for this, that's
really too old for any practical purpose. :)
Sac> RT seem to be very slow in getting around to that. But there must be a reason
Sac> for it. So any company thinking of using REBOL is probably better off using the
Sac> official releases.
I disagree. There are a number of reasons for the betas not being
promoted to official release, but a company using REBOL is
probably using the SDK which is at least View 1.2.10. So unless
they manage to buy an older version of REBOL, they get 1.2.10 when
they buy the SDK.
Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amiga Group Italia sez. L'Aquila --- SOON: http://www.rebol.it/
[5/5] from: pwawood::mango::net::my at: 18-Nov-2004 12:09
Perhaps I can summarise Gabriele's and Sunanda's helpful advice on
handling "untrusted" data :
1. Data that has not been validated may, accidentally or maliciously,
include invalid or valid Rebol code. It needs to be treated with care.
2. The safest option is to use "to block!" or "to-block" as it does not
bind the words so they cannot be accidentally evaluated. For example :
>> to block! "quit"
== [quit]
>> do to block! "quit"
** Script Error: quit word has no context
** Near: quit
It is possible to reduce the number of system words consumed by using
the "to" approach rather than "load". For example
>> length? first system/words
== 1246
>> do to block! "val1"
** Script Error: val1 word has no context
** Near: val1
>> length? first system/words
== 1246
>> do load "val2"
** Script Error: val2 has no value
** Near: do load "val2"
>> length? first system/words
== 1247
>> do load/all "val3"
** Script Error: val3 has no value
** Near: val3
>> length? first system/words
== 1248
3. Load/all is safer than Load with older versions of Rebol including
the current official View release 1.2.1.
4. It is advisable to wrap the to-block or load of untrusted data in an
error/try block as some strings will give problems. For example:
>> load "]"
** Syntax Error: Missing [ at end-of-block
** Near: (line 1) ]
>> load/all "]"
** Syntax Error: Missing [ at end-of-block
** Near: (line 1) ]
>> to block! "]"
** Syntax Error: Missing [ at end-of-block
** Near: (line 1) ]
>> error? try [load/all "]"]
== true
Please let me know if I have summarised this incorrectly.
Regards
Peter