[REBOL] Re: Parse versus Regular Expressions
From: g:santilli:tiscalinet:it at: 4-Apr-2003 15:54
On Friday, April 4, 2003, 2:57:22 PM, you wrote:
JN> The string containing "b" or the string containing "ba".
Well, indeed it is not very intuitive, but the real meaning is:
Try to match the rule "b". If it does not match, try to
match the rule "ba".
PARSE then returns TRUE or FALSE depending on whether the rule
matched the whole string or not (i.e. whether the current position
after the match is the tail of the string or block).
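As a small illustration at the console (a sketch assuming REBOL 2
behaviour, which is what this thread is about), the rule in question
gives:

```rebol
; The first alternative "b" matches, but PARSE then stops in front
; of "a", which is not the tail of the string, so the result is FALSE.
parse "ba" ["b" | "ba"]     ; == false

; Here "b" matches and the current position is the tail.
parse "b" ["b" | "ba"]      ; == true
```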
JN> 1) It makes the PARSE dialect a strange hybrid between declarative
JN> and procedural descriptions (as mentioned in my earlier post).
The PARSE dialect is something that is matched to the input string
or block from left to right. I'm not sure if this makes it a
hybrid; probably it does, and I admit I like it this way. (Much
better than RE, that is.)
JN> 2) It totally breaks normal usage that "or" is symmetrical, even
JN> though we read the | as "or".
Indeed, | does not have the intuitive meaning of OR. It is just a
marker: if PARSE reaches it, then the current rule block has
succeeded and PARSE stops matching it. If a match fails, then PARSE
looks to see if there's a | in the current rule block and restarts
matching from after the |, going back to the input position where
the failed alternative started.
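One way to watch this happen is to put a paren with a PRINT after each
alternative. This is only a sketch, assuming REBOL 2 behaviour; the
messages are my own labels, not anything PARSE produces by itself:

```rebol
; "b" matches, the paren is evaluated, PARSE reaches the | and leaves
; the block. Nothing failed, so the second alternative is never tried.
parse "ba" [
    "b"  (print "first alternative matched")
  | "ba" (print "second alternative matched")
]
; prints: first alternative matched
; == false

; "b" fails on "a", so PARSE restarts from after the |.
parse "a" [
    "b" (print "first alternative matched")
  | "a" (print "second alternative matched")
]
; prints: second alternative matched
; == true
```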
JN> Is the sequence of letters "BA" accepted by a rule that
JN> will match "B" or "BA"?
JN> I believe that most people would say "Of course!"
A rule that returns TRUE for "B" or "BA" is:
["b" end | "ba" end]
or, if you want to avoid the END,
["ba" | "b"]
(i.e. having the longest-matching rule first).
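A quick check of both rules against both inputs, again as a sketch
assuming REBOL 2 behaviour:

```rebol
; With END, the first alternative only succeeds at the tail.
parse "b"  ["b" end | "ba" end]   ; == true
parse "ba" ["b" end | "ba" end]   ; == true  ("b" end fails, "ba" end is tried)

; With the longer alternative first, no END is needed.
parse "b"  ["ba" | "b"]           ; == true  ("ba" fails, "b" is tried)
parse "ba" ["ba" | "b"]           ; == true
```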
JN> AFAIK the only answers to "Why are these different?" would be
JN> a) a complicated description of the implementation, or
I don't think so. It is a very simple description of the
implementation, because the implementation is very simple. (In
contrast, RE parsers are usually very complex.)
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r