Mailing List Archive: Re: Parse versus Regular Expressions

[REBOL] Re: Parse versus Regular Expressions

From: andreas:bolka:gmx at: 5-Apr-2003 4:03


Hi Joel, Ladislav,

Saturday, April 5, 2003, 1:01:10 AM, Joel wrote:

> Ladislav Mecir wrote:
>>
>> As I mentioned here, it is:
>>
>>     z: [#a z #b | none]
>>
>> , which is a rule, that cannot be written using Regular
>> Expressions.
>>

IMHO 'parse _should_ be a context-free grammar (CFG) parser. As the '|
debate shows 'parse does not (per default) behave in a completely CFG
conform way, it (at least) misses the symmetrical OR.

However, CFGs are a strict superset of Regular Expressions (RE) -
AFAIK every possible RE pattern can be expressed as CFG.

> OK.  The old context-free-vs-context-sensitive thing.

The interesting thing is, that this classic example is basically not
context sensitive
 but only a symmetrical production. If symmetry is
considered as an instance of "context-sensitivity" is up to the
reader, nevertheless, those symmetrical productions are something REs
cannot express.

> I'm wondering what happens if we make the problem just a little
> larger, and try to match any of

>     ""
>     "abc"
>     "aabbcc"
>     "aaabbbccc"

> and so on.  [...]

>     $somestring =~ /^(a*)(b*)(c*)$/ and
>        length ($1) == length ($2)   and
>        length ($2) == length ($3)

> and so on for four or more "pieces".

> How would you attack that with the PARSE dialect?

Are you not mixing the grammar description with "external" (at least
from the grammar's point of view) programming language facilities?

This mixing is of course possible with 'parse as well:

parse-r: func [ s /local r p1 p2 p3 ] [
    r: [ [copy p1 any "a"] [copy p2 any "b"] [copy p3 any "c"] ]
    all [ (parse s r) (= length? p1 length? p2) (= length? p2 length? p3) ]
]

This does not account for empty sentences, but I think this is really
not the point :) The example you gave is indeed context-sensitive and
as both, REs and CFGs have no real sense of context, "external"
facilities are required to establish the required context.

--
Best regards,
 Andreas                            mailto:[andreas--bolka--gmx--net]