Mailing List Archive: Re: REBOL embedded $variables regexp/no

[REBOL] Re: REBOL embedded $variables regexp/no

From: joel::neely::fedex::com at: 9-Dec-2002 13:45


Hi, Tom,

Responding with equal good will...  ;-)

Tom Conlin wrote:
> no offense but,
> For me never having to see another regexp is a feature.
> Before this goes further I would like to see anything that
> can be done with regexps that can't be done with parse
> and being short & opaque doesn't count.
>

Calling something "opaque" sounds (to my ear, at least) like more of
a value judgement than an objective description.  However, it's quite
easy to look at two ways of expressing an idea to see if one is
significantly more "short" than the other.

So... to respond to your inquiry, if we assume that two notations
(such as Perl and REBOL) are both Turing-complete, then anything that
can be done with one can be done with the other.  However, that's
also true of assembler...  ;-)

Some time back we had a long discussion about removing redundant
whitespace from a string.  This is an area where the "search-and-
replace" usage of regular expressions provides some real notational
economy, IMHO.

    $bigstring =~ s/\s+/ /g;

Please, everyone, before complaining about the punctuation, get over
it and look at the real point; that expression says (compactly) the
equivalent of this:

    Within the variable named "bigstring" replace all runs of
    whitespace (*) with a single blank.

    (*)  technically it says "one or more consecutive whitespace
         characters", which I feel justified in verbalizing as
         "runs of whitespace".

The point is that, having learned this pattern, it generalizes quite
nicely, so that

    $bigstring =~ s/\d+/#/g;

means:

    Within the variable named "bigstring" replace all runs of
    digits with a single pound-sign/number-sign/octothorp.
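
For anyone who would rather run the idea than read it, here is a minimal
sketch of the same two substitutions, written in Python's `re` module
rather than Perl. The helper name `replace_runs` and the sample string are
made up for illustration; the point is only that one pattern shape
("one or more of some character class") covers both tasks.

```python
import re

def replace_runs(text, char_class, replacement):
    """Replace every run (one or more) of char_class with replacement.

    Roughly what  s/<class>+/<replacement>/g  does in Perl.
    `replace_runs` is a hypothetical helper, not part of any library.
    """
    return re.sub(char_class + "+", replacement, text)

# Made-up sample text with runs of blanks, tabs, and digits.
sample = "call   555  1234\t\tnow"

collapsed = replace_runs(sample, r"\s", " ")   # like s/\s+/ /g
masked = replace_runs(collapsed, r"\d", "#")   # like s/\d+/#/g

print(collapsed)  # call 555 1234 now
print(masked)     # call # # now
```

Switching from whitespace to digits changes one argument and nothing
else, which is the kind of re-use being described.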

Please, everyone, before jumping to either of the conclusions that

1)  regular expressions are ugly, and/or
2)  Perl is ugly

and therefore dismissing either from thought, take the time/effort to
write equivalent REBOL for the above descriptions.  Then think about
what's similar and different in the solutions for those two tasks,
how much code had to be written for each, how easy it was to create
one by re-using as much as possible from the other, etc.

Then think about a slightly more interesting case, such as

    $bigstring =~ s/([A-Z][a-z]{2}) (\d{1,2}), (\d{4})/$2-$1-$3/g;

Again, let's keep the focus on the real question: how much trouble is
it to transform a string based on searching for generalized patterns
and replacing each occurrence with something based on the pattern that
was found.  By defining suitable "helper" variables, it's perfectly
possible to write this as

    $bigstring =~ s/($month) ($day), ($year)/$2-$1-$3/g;

instead, but that's up to the programmer.  In either case, with a bit
of wrapper (e.g. to read from a file and print the results of the
replacement), we turn text that looks like this

    On Dec 09, 2002 I wrote an email that talked
    about modifying strings based on a match-and-
    replace strategy.  This was in response to
    messages posted in the REBOL mailing list on
    Dec 07, 2002 and Dec 08, 2002.

into text that looks like this

    On 09-Dec-2002 I wrote an email that talked
    about modifying strings based on a match-and-
    replace strategy.  This was in response to
    messages posted in the REBOL mailing list on
    07-Dec-2002 and 08-Dec-2002.
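
The transformation above can be sketched end to end as follows. This is
a Python rendering under stated assumptions, not the Perl wrapper itself:
the names `MONTH`, `DAY`, `YEAR`, `DATE`, and `reformat_dates` are
illustrative, the text is held in a string instead of being read from a
file, and Python writes the captured groups as `\1`, `\2`, `\3` where the
Perl replacement uses `$1`, `$2`, `$3`.

```python
import re

# "Helper" sub-patterns, mirroring the $month/$day/$year idea.
MONTH = r"[A-Z][a-z]{2}"   # e.g. Dec
DAY = r"\d{1,2}"           # e.g. 09
YEAR = r"\d{4}"            # e.g. 2002

# Three capture groups: month, day, year (in that order).
DATE = re.compile(rf"({MONTH}) ({DAY}), ({YEAR})")

def reformat_dates(text):
    """Rewrite every 'Mon DD, YYYY' as 'DD-Mon-YYYY'."""
    return DATE.sub(r"\2-\1-\3", text)

text = (
    "On Dec 09, 2002 I wrote an email that talked\n"
    "about modifying strings based on a match-and-\n"
    "replace strategy.  This was in response to\n"
    "messages posted in the REBOL mailing list on\n"
    "Dec 07, 2002 and Dec 08, 2002.\n"
)

print(reformat_dates(text))
```

Text that does not match the pattern passes through untouched, so no
separate "everything else" rule has to be written.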

Again, I think the real question is how much code the REBOL programmer
has to write to get the equivalent transformation.

As PARSE is an all-or-nothing affair, I believe that it currently
requires the programmer to work harder to do these kinds of tasks.
(And they occur often enough in the kinds of programming that I do
that I find that difference in effort significant to my productivity.)

-jn-

--
----------------------------------------------------------------------
Joel Neely            joelDOTneelyATfedexDOTcom           901-263-4446