More dialects than you think, validating function input.
[1/12] from: brett:codeconscious at: 12-Sep-2002 22:05
I have a function that takes a block! as an argument. This block! should
contain only numbers. The question is how do I check for valid input and
throw an error on encountering a problem in an efficient manner?
My first thought was something like this:
f1: function [
    [catch]
    block [block!]
] [kount] [
    kount: 0
    repeat value block [
        if not number? value [
            throw make error! "F1 can only process numbers."
        ]
        kount: kount + 1
    ]
]
After reflecting on this, I thought about how a block! passed to a function
is, in one sense, being interpreted by the function according to an implied
grammar. That is, my requirement in the above function for only numbers, in a
sense, implies that my function is processing a fairly simple REBOL-based
grammar (i.e. a dialect). I don't think it is too outlandish to say every
function that processes a block! is implementing grammar processing, even if
trivial. Hence, the subject title of this message.
As a side note, I reckon every offline script can be considered as an out of
memory block! ready for loading.
Anyway, this thought leads me to PARSE, REBOL's grammar validator. So I
recoded my function to look something like this:
f2: function [
    [catch]
    block [block!]
] [kount value] [
    kount: 0
    if not parse block [
        any [set value number! (kount: kount + 1)] |
        set value any! to end
    ] [throw make error! "F2 can only process numbers."]
]
Then I did some quick and dirty testing to see which was faster. On my
machine, F2 takes half the time that F1 does.
Just thought I'd share my little ramble. Test code below.
Cheers,
Brett.
REBOL []
data: copy []
repeat i 2000 [insert tail data i]
f1: function [
    [catch]
    block [block!]
] [kount] [
    kount: 0
    repeat value block [
        if not number? value [
            throw make error! "F1 can only process numbers."
        ]
        kount: kount + 1
    ]
]
f2: function [
    [catch]
    block [block!]
] [kount value] [
    kount: 0
    if not parse block [
        any [set value number! (kount: kount + 1)] |
        set value any! to end
    ] [throw make error! "F2 can only process numbers."]
]
timeit: func [f] [
    t1: now/precise
    repeat i 300 [error? try [f data]]
    t2: now/precise
    t2/time - t1/time
]
print ["f1" timeit :f1]
print ["f2" timeit :f2]
insert at data 1001 'a-word
print ["f1 - error half way" timeit :f1]
print ["f2 - error half way" timeit :f2]
---
Website: http://www.codeconscious.com
Rebsite: http://www.codeconscious.com/index.r
[2/12] from: brett:codeconscious at: 12-Sep-2002 22:10
Tiny correction to my code so as not to confuse. It does not change the
result or conclusion.
set value any! to end
should read
set value any-type! to end
Brett.
[3/12] from: lmecir:mbox:vol:cz at: 12-Sep-2002 15:05
Hi Brett,
I don't know why you didn't write it as follows:
f3: func [
    [catch]
    block [block!]
] [
    if not parse block [any [number!] end] [
        throw make error! "F3 can only process numbers."
    ]
]
The parse rule contains END only for compatibility with my proposed PARSE
behaviour, otherwise it isn't necessary.
Cheers
-L
[4/12] from: brett:codeconscious at: 12-Sep-2002 23:32
Hi Ladislav,
I was less than clear. The kount stuff was to simulate "other processing".
The value needs to be set for processing and to report the type in the error
message. My mistake for oversimplifying the examples, and for stuffing them
up. Here is the actual function I was working on:
bayes: function [
    {Calculate combined probability.}
    [catch] probabilities [any-block!]
] [p0 p1 d value] [
    p0: p1: 1.0
    if not parse probabilities [
        any [
            set value number! (p0: value * p0 p1: 1 - value * p1) |
            set value any-type! to end skip
        ]
    ] [
        throw make error! reduce [
            'script 'cannot-use 'bayes mold type? get/any 'value
        ]
    ]
    if zero? d: add p0 p1 [
        throw make error! "The probabilities cannot be combined."
    ]
    divide p0 d
]
Sorry for the confusion.
Brett.
[5/12] from: g:santilli:tiscalinet:it at: 12-Sep-2002 15:26
Hi Brett,
On Thursday, September 12, 2002, 2:05:39 PM, you wrote:
BH> if not parse block [
BH> any [set value number! (kount: kount + 1)] |
BH> set value any! to end
BH> ] [throw make error! "F2 can only process numbers."]
The second part of the rule will never be reached. ANY does not
fail if it matches 0 elements.
In your trivial example, your code could look like:
if not parse block [any number!] [ ...
provided you are accepting empty blocks as input too. Of course, I
imagine your example was actually cut down from something bigger,
that needed the SET VALUE ... and the count (which you could maybe
do faster by just using LENGTH?...).
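A small sketch of the point about ANY, meant for a REBOL 2 console. The words OK and REACHED are made up for illustration; the paren in the second alternative only records whether that alternative was ever tried.

```rebol
REBOL []

; ANY matches zero or more times, so it always succeeds.
; The alternative after | is therefore never tried.
reached: false
ok: parse [1 2 a-word 3] [
    any [number!]
    | (reached: true) to end
]
print ["parse result:" ok]                   ; false, input stops at a-word
print ["second alternative tried:" reached]  ; false

; The shorter rule gives the same pass/fail answer.
print parse [1 2 3] [any number!]            ; true
print parse [] [any number!]                 ; true, empty input is accepted
print parse [1 2 a-word 3] [any number!]     ; false
```

If empty blocks should be rejected, SOME in place of ANY requires at least one number.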
Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r
[6/12] from: greggirwin:mindspring at: 12-Sep-2002 9:56
Brett, et al
A very good post, and related to the discussion on error handling to boot!
:)
I can see at least two major ways to apply this approach.
1) Parse drives the processing of the function, automatically validating the
correctness of the argument block as it goes.
2) Parse is used only to validate the block at the entry point, like a
precondition in a Design by Contract design, and the operation of the
function is separate. You could also use parse to validate return values
(i.e. as a post-condition processor).
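A rough sketch of the second approach, assuming REBOL 2. REQUIRE and AVERAGE are hypothetical names invented for this example, not existing functions; the parse rules play the role of the pre- and post-conditions and the real work stays outside PARSE.

```rebol
REBOL []

; Hypothetical helper: raise an error unless the block VALUE matches RULE.
require: func [
    [catch]
    label [string!] value [block!] rule [block!]
][
    if not parse value rule [
        throw make error! join label " check failed."
    ]
    value
]

; Precondition: one or more numbers in. Postcondition: exactly one number out.
average: func [numbers [block!] /local total] [
    require "average precondition" numbers [some number!]
    total: 0
    foreach n numbers [total: total + n]
    first require "average postcondition" reduce [total / length? numbers] [number!]
]

print average [1 2 3 6]
print error? try [average [1 a-word 3]]   ; the precondition rejects the word
```

Keeping the rule in one helper means the checks can be made cheaper or switched off in one place if they turn out to cost too much.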
In a large system, and as more things are driven by dialects, this could
prove very useful indeed. Now, does anyone have a model they use to return
helpful context information when a parse operation fails? Programmers are OK
with "Syntax error: expected integer!" but users are probably better served
by "I understood the first part ("sell 400 shares of MSFT at"), but then I
was expecting to see a monetary value for the sell price (e.g. $50.00), and
I didn't, so I couldn't process your request. I'm sorry for any
inconvenience this might cause you. Have a nice day." Or maybe just "No sell
price was specified in your request." or even a prompt for the missing info.
In any case, I agree that this is a good tool to keep in mind.
--Gregg
[7/12] from: lmecir:mbox:vol:cz at: 12-Sep-2002 18:35
Thanks.
-L
[8/12] from: reffy:ulrich at: 12-Sep-2002 19:56
Watching the thread on error handling ...
Giving useful information seems to me to be at least 2 parts:
1. A specific operation fails - need to have information which readily reveals domain
error, length error, mismatched operand types, etc.
2. Contextually, it might be important for the developer to control what gets displayed
for tracking purposes.
Since having one scheme (as the language developer) that suits everyone's taste is very
difficult, maybe a compromise could be had?
As a developer, I would like to be able to place begin/end markers in my code.
The purpose of the markers is to signal the language execution engine the type/degree
of capturing trace information. I can imagine several fine/coarse grained levels. The
coarseness of the tracing could be at the function level, the statement level, maybe
even other levels.
I like this approach because I probably don't want the execution engine burdened with
something which might yield an overall performance hit. I/you can probably be the best
judge of when/what we need information about. It would be up to us to cause the performance
hit of tracing/tracking flow of control.
Comments?
Dick
[9/12] from: joel:neely:fedex at: 13-Sep-2002 7:50
Hi, Dick,
You've given me an idea... (Thanks!)
[reffy--ulrich--net] wrote:
> Watching the thread on error handling ...
>
...
> As a developer, I would like to be able to place begin/end markers
> in my code. The purpose of the markers is to signal the language
<<quoted lines omitted: 7>>
> we need information about. It would be up to us to cause the
> performance hit of tracing/tracking flow of control.
This is a *real* QAD, but the idea was to write something that would
let me "enclose" a block with a marker so that any error occurring
within that block would generate a display of all currently active
markers as well as the error message itself.
Again, this is a quick hack with much room for improvement...
Here's the error zone manager, in %errzone.r:
8<----------
errzone: make object! [
    zones: []
    init: func [] [zones: copy []]
    new: func [/local newself] [
        newself: make self []
        newself/init
        newself
    ]
    nest: func [s [string!] b [block!] /local err] [
        insert tail zones s
        either error? set/any 'err try b [
            print ["^/Error within"]
            foreach zone zones [print ["^-" zone]]
            print ["was^/" mold disarm err newline]
            halt
        ][
            remove back tail zones
        ]
    ]
]
8<----------
Here are some error-prone routines that use ERRZONE, either
recursively or in nested function evaluations:
8<----------
do %errzone.r
irecurrence: func [n [integer!] /local rzone _recurrence] [
    rzone: errzone/new
    do _recurrence: func [
        p [integer!] n [integer!]
        /local result
    ][
        rzone/nest rejoin ["tracing: " p ", " n] [
            result: _recurrence p / n n - 1
        ]
        result
    ] 1000 n
]
nrecurrence: func [n [integer!] /local rzone _recurrence] [
    rzone: errzone/new
    do _recurrence: func [
        p [number!] n [integer!]
        /local result
    ][
        rzone/nest rejoin ["tracing: " p ", " n] [
            result: _recurrence p / n n - 1
        ]
        result
    ] 1000 n
]
gzone: errzone/new
eenie: func [n [integer!]] [
    gzone/nest "eenie" [
        loop random 5 [print "blah "]
        print 1 / n
        loop random 5 [print "yak "]
    ]
]
meenie: func [n [integer!]] [
    gzone/nest "meenie" [
        loop random 5 [print "blah "]
        eenie n
        loop random 5 [print "yak "]
    ]
]
meinie: func [n [integer!]] [
    gzone/nest "meinie" [
        loop random 5 [print "blah "]
        meenie n
        loop random 5 [print "yak "]
    ]
]
moe: func [n [integer!]] [
    gzone/nest "moe" [
        loop random 5 [print "blah "]
        meinie n
        loop random 5 [print "yak "]
    ]
]
8<----------
Here are some annotated results of using the above "clients":
8<----------
>> irecurrence 5
Error within
tracing: 1000, 5
tracing: 200, 4
tracing: 50, 3
was
make object! [
code: 303
type: 'script
id: 'expect-arg
arg1: '_recurrence
arg2: 'p
arg3: [integer!]
near: [result: _recurrence p / n]
where: 'nest
]
8<----------
I included this one because it surprised me at first. (I didn't
read the error object carefully enough!) The value of 50 / 3 is
not integral, so the attempt to invoke _RECURRENCE failed because
of an argument type error.
8<----------
>> nrecurrence 5
Error within
tracing: 1000, 5
tracing: 200, 4
tracing: 50, 3
tracing: 16.6666666666667, 2
tracing: 8.33333333333333, 1
tracing: 8.33333333333333, 0
was
make object! [
code: 400
type: 'math
id: 'zero-divide
arg1: none
arg2: none
arg3: none
near: [result: _recurrence p / n]
where: 'nest
]
8<----------
Our classic poster-child error...
8<----------
>> moe 1
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
1
yak
yak
yak
yak
yak
yak
yak
yak
yak
yak
yak
yak
yak
yak
= []
>> moe 0
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
blah
Error within
moe
meinie
meenie
eenie
was
make object! [
code: 400
type: 'math
id: 'zero-divide
arg1: none
arg2: none
arg3: none
near: [print 1 / n
loop]
where: 'nest
]
8<----------
The first call to MOE succeeded (producing lots of simulated
work and output), but the second failed four levels of function
evaluation down.
As I said above, this is a bit kludged up, but serves as a proof
of concept. For example, we could define a new function-defining
word something like this
safe-foo: zonefunc zoneobj "FOO Zone" [...args...] [...body...]
in contrast with
foo: func [...args...] [...body...]
to wrap the body of the newly-defined function with an appropriate
use of the ERRZONE concept.
-jn-
--
; Joel Neely joeldotneelyatfedexdotcom
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t: "" foreach [a b] s [repend t [b a]] t ] {
| e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]
[10/12] from: brett:codeconscious at: 13-Sep-2002 22:13
Hi Gregg,
> A very good post, and related to the discussion on error handling to boot!
Thanks!
> In a large system, and as more things are driven by dialects, this could
> prove very useful indeed.
It is probably overkill for some situations, on the other hand you never
know when you might want that one-off to be included in some large
undertaking. :^)
> In a large system, and as more things are driven by dialects, this could
> prove very useful indeed. Now, does anyone have a model they use to return
> helpful context information when a parse operation fails? Programmers are OK
> with "Syntax error: expected integer!" but users are probably better served
> by "I understood the first part ("sell 400 shares of MSFT at"), but then I
> was expecting to see a monetary value for the sell price (e.g. $50.00), and
> I didn't, so I couldn't process your request. I'm sorry for any
> inconvenience this might cause you. Have a nice day." Or maybe just "No sell
> price was specified in your request." or even a prompt for the missing info.
I think you've got at least three issues there. One is relating to speaking
the user's language.
The second is determining exactly where in the input the error is - which is
more difficult.
Third is showing what is expected next.
Your "I understood the first part ..." is a good solution for showing the
user where processing got to.
As for the expecting bit. I don't think there is a general solution. Having
to tell the user "error...expecting money" adds an application requirement
making a bigger application. Each "expecting X" becomes an alternative rule
to the rule that is attempting a match. So your rules bloat out quite a bit
if you want to report on every grammar term. For some apps maybe it is
enough to say "Recognised two stock orders. Could not recognise the rest of
the instructions."
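As a rough illustration of that last kind of message, here is a sketch assuming REBOL 2. The mini-dialect (sell <integer> <word> at <money>) and the names ORDER-RULE and CHECK-ORDERS are invented for the example. A position marker inside the rule remembers how far the input was understood, without adding an "expecting X" alternative to every term.

```rebol
REBOL []

; Hypothetical rule for one order:  sell <integer> <word> at <money>
order-rule: ['sell integer! word! 'at money!]

check-orders: func [input [block!] /local pos count] [
    count: 0
    pos: input
    ; POS: records the input position after each complete order.
    parse input [any [order-rule pos: (count: count + 1)]]
    either tail? pos [
        print ["Recognised" count "stock orders."]
    ][
        print [
            "Recognised" count "stock orders."
            "Could not recognise the rest of the instructions:" mold pos
        ]
    ]
    count
]

; The third order is missing its price, so two orders are recognised
; and the unparsed remainder is reported.
check-orders [sell 400 MSFT at $50.00 sell 10 IBM at $70.00 sell 5 SUNW]
```

This only says where processing got to; it does not say what was expected next, which still needs explicit rules.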
Regards,
Brett.
[11/12] from: greggirwin:mindspring at: 13-Sep-2002 12:52
Hi Brett,
<< As for the expecting bit. I don't think there is a general solution. Having
to tell the user "error...expecting money" adds an application requirement
making a bigger application. Each "expecting X" becomes an alternative rule
to the rule that is attempting a match. So your rules bloat out quite a bit
if you want to report on every grammar term. For some apps maybe it is
enough to say "Recognised two stock orders. Could not recognise the rest of
the instructions." >>
I think I didn't say that I thought more than I said. :)
What I was *thinking*, but didn't write in my message, was how many parsers
provide context information when a syntax error occurs and that PARSE might
be able to make available that kind of information so we don't have to do it
all ourselves at the individual rule level. F'rinstance:
>> add 1
** Script Error: add expected value2 argument of type: number pair char
money date time tuple
** Near: add 1
REBOL has rules about how to parse code based on the spec block (the rule)
for a function, no? So, based on the rule(s) that have succeeded, you know
what state the parser was in when it hit a rule that failed, and what that
rule (or rules) was.
I like the way it works today, not popping an error directly to the user,
but you have no idea exactly where the parse failed unless, as you point
out, you add stuff to all your rules to track exactly what's going on. Maybe
that ends up being a viable alternative in some cases, but if PARSE gave us
some more clues, it might be useful.
Another problem I have is that I start thinking way ahead of myself about
"what could be". I can see great potential for higher level dialects driving
PARSE; generating rules dynamically (including specific constraints) based
on data from a database, previous messages in a conversation, etc.; and
tools to help visualize and debug dialects. I just about jump out of my seat
when I think about some of the possibilities. :)
--Gregg
[12/12] from: brett:codeconscious at: 14-Sep-2002 10:15
Hi Gregg,
> What I was *thinking*, but didn't write in my message, was how many parsers
> provide context information when a syntax error occurs and that PARSE might
> be able to make available that kind of information so we don't have to do it
> all ourselves at the individual rule level. F'rinstance:
I think I understood what the intent of your message was and even was going
to suggest a new keyword to assist with the context info, but when I thought
of some of the parse rules I'd written I could not imagine how any context
information issued from them would help me produce a meaningful response to
the user. So I unimaginatively said I don't think there is a general
solution. :^)
> Another problem I have is that I start thinking way ahead of myself about
> "what could be". I can see great potential for higher level dialects driving
> PARSE; generating rules dynamically (including specific constraints) based
> on data from a database, previous messages in a conversation, etc.; and
> tools to help visualize and debug dialects. I just about jump out of my seat
> when I think about some of the possibilities. :)
Hardly a problem. Perhaps just not always immediately measurably productive.
On the other hand such thinking is part of the creative search I'm sure.
Regards,
Brett.