Mailing List Archive: Re: Parse versus Regular Expressions

[REBOL] Re: Parse versus Regular Expressions

From: g:santilli:tiscalinet:it at: 4-Apr-2003 15:54


Hi Joel,

On Friday, April 4, 2003, 2:57:22 PM, you wrote:

JN>     The string containing "b" or the string containing "ba".

Well, indeed it is not very intuitive, but the real meaning is:

        Try to match the rule "b". If it does not match, try to
        match the rule "ba".

PARSE then returns TRUE or FALSE depending on whether the rule
matched the whole string or not (i.e. whether the current position
after the match is the tail of the string or block).
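
A minimal sketch of this at the console, assuming a REBOL 2
interpreter (the word POS is only a name chosen here to capture
where matching stopped):

```rebol
; "b" matches the first character, so the rule block succeeds
; without ever trying "ba"; POS records where matching stopped.
probe parse "ba" [["b" | "ba"] pos:]   ; false: input is not at its tail
probe pos                              ; "a": the unconsumed remainder

; With input "b" the same rule leaves the position at the tail.
probe parse "b" [["b" | "ba"] pos:]    ; true
probe tail? pos                        ; true
```

The rule itself matched in both cases; only the final position
differs, and that is what decides the return value.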

JN> 1)  It makes the PARSE dialect a strange hybrid between declarative
JN>     and procedural descriptions (as mentioned in my earlier post).

The PARSE dialect is something that is matched to the input string
or block from left to right. I'm not sure if this makes it a
hybrid; probably it does, and I admit I like it this way. (Much
better than RE, that is.)

JN> 2)  It totally breaks normal usage that "or" is symmetrical, even
JN>     though we read the  |  as "or".

Indeed, | does not have the intuitive meaning of OR. It is just a
marker: if PARSE reaches it, then it stops matching the current
rule block. If a match fails, then PARSE looks to see if there's
a | further on in the current rule block and restarts matching
from after the |.
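
As a small illustration of the marker behaviour (a sketch, again
assuming REBOL 2; the strings are made up for the example), note
that the retry starts from the input position where the rule block
began, not from where the failure happened:

```rebol
; First alternative: "a" matches, then "b" fails against "c".
; PARSE skips to after the | and retries "a" "c" from the
; input position where the block started.
probe parse "ac" ["a" "b" | "a" "c"]   ; true

; Here the first alternative matches completely, so PARSE
; reaches the | and stops; "a" "c" is never tried.
probe parse "ab" ["a" "b" | "a" "c"]   ; true

; No | follows the failing rule here, so the match fails.
probe parse "ac" ["a" "b"]             ; false
```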

JN>         Is the sequence of letters "BA" accepted by a rule that
JN>         will match "B" or "BA"?

JN>     I believe that most people would say "Of course!"

A rule that returns TRUE for "B" or "BA" is:

        ["b" end | "ba" end]

or, if you want to avoid the END,

        ["ba" | "b"]

(i.e. having the longest-matching rule first).
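
Both rules can be checked against a few inputs with a short loop.
This is only a sketch; the words RULE-WITH-END, RULE-LONGEST and
STR are names invented for the example:

```rebol
rule-with-end: ["b" end | "ba" end]
rule-longest:  ["ba" | "b"]

; Print each input followed by the result of each rule.
foreach str ["b" "ba" "bab"] [
    print [mold str  parse str rule-with-end  parse str rule-longest]
]
; "b" true true
; "ba" true true
; "bab" false false
```

With the END version the order of the alternatives no longer
matters, because "b" END fails on "ba" and PARSE moves on to the
second alternative.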

JN>     AFAIK the only answers to "Why are these different?" would be
JN>     a)  a complicated description of the implementation, or

I don't think so. It is a very simple description of the
implementation, because the implementation is very simple. (In
contrast, RE parsers are usually very complex.)

Regards,
   Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]>  --  REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r