[REBOL] Re: REBOL embedded $variables regexp/no
From: g:santilli:tiscalinet:it at: 10-Dec-2002 12:50
Hi Joel,
On Monday, December 9, 2002, 8:45:19 PM, you wrote:
JN> Please, everyone, before complaining about the punctuation, get over
JN> it and look at the real point; that expression says (compactly) the
JN> equivalent of this:
[...]
Joel, you are right, but also consider that if I look at it, I
don't understand what it is doing.
Of course, that applies to every language, but if you look at:
replace/all s "tetx" "text"
you can guess what it is doing even if you don't know REBOL at
all. (Or if a couple of months have passed since you last touched
it, which is much more common and is the real point I wish to
make.)
This said, surely it would be nice to have pattern matching in
REBOL. Maybe using a different notation for RegExps that is a
little bit less voodoo. However, pattern matching has the
disadvantage of looking like a simple step, while it can be very
computationally intensive; I prefer PARSE because it is very clear
how complex a rule is, computationally.
JN> Within the variable named "bigstring" replace all runs of
JN> digits with a single pound-sign/number-sign/octothorp.
Now, I don't claim this to be more readable, because you need to
provide a rule to match the text that does not match the pattern
rule, but I found it very easy to code it, and it looks very
reusable.
    pattern-replace: func [string text-pattern match-pattern replacement /local result txt] [
        result: make string! length? string
        parse/all string [
            copy txt text-pattern (emit result txt)
            any [
                match-pattern (emit result replacement)
                copy txt text-pattern (emit result txt)
            ]
        ]
        result
    ]
    emit: func [dest value] [if value [append dest reduce value]]
>> digits: charset "1234567890"
>> chars: complement digits
>> pattern-replace "Replace 5248 with a #" [any chars] [some digits] "#"
== "Replace # with a #"
JN> Then think about a slightly more interesting case, such as
JN> $bigstring =~ s/([A-Z][a-z]{2}) (\d{1,2}), (\d{4})/$2-$1-$3/g;
If we don't care about the time it requires to do it,
>> not-rule: func [rule'] [use [rule mark] [rule: rule' copy/deep [some [mark: rule :mark break | skip]]]]
(This requires the beta for the BREAK keyword. It is possible,
with a little more effort, to do the same without using BREAK.)
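For what it's worth, here is a rough sketch of how the BREAK-free version might look. The name NOT-RULE-NOBREAK and the helper word NEXT-STEP are made up for illustration. The idea is to look ahead with RULE, go back to MARK, and then run either a rule that can never match ([end skip]) to stop the loop, or SKIP to move one character forward. It uses ANY rather than SOME so that it can also match nothing.

```rebol
; Hypothetical BREAK-free variant of NOT-RULE (a sketch, not tested on every REBOL version).
; Matches zero or more characters up to the point where RULE' would match.
not-rule-nobreak: func [rule'] [
    use [rule mark next-step] [
        rule: rule'
        copy/deep [
            any [
                mark:
                ; look ahead: choose what to do after returning to MARK
                [rule (next-step: [end skip]) | (next-step: [skip])]
                :mark
                ; [end skip] always fails and ends the loop; [skip] consumes one char
                next-step
            ]
        ]
    ]
]
```

With PATTERN-REPLACE as defined above, something like pattern-replace "Replace 5248 with a #" not-rule-nobreak [some digits] [some digits] "#" should then behave like the [any chars] version.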
Then:
>> string: {
{ On Dec 09, 2002 I wrote an email that talked
{ about modifying strings based on a match-and-
{ replace strategy. This was in response to
{ messages posted in the REBOL mailing list on
{ Dec 07, 2002 and Dec 08, 2002.
{ }
>> ucase: charset [#"A" - #"Z"]
>> lcase: charset [#"a" - #"z"]
>> date: [copy month [ucase 2 lcase] " " copy day 1 2 digits ", " copy year 4 digits]
>> print pattern-replace string not-rule date date [day "-" month "-" year]
On 09-Dec-2002 I wrote an email that talked
about modifying strings based on a match-and-
replace strategy. This was in response to
messages posted in the REBOL mailing list on
07-Dec-2002 and 08-Dec-2002.
(You might argue that NOT-RULE is tricky; I agree, however once
you have included it in REBOL/Core you just need to use it.)
With a little more effort, one could write a faster rule for
matching the text that is not a date.
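As a sketch of what such a faster rule might look like: a date can only start at an uppercase letter, so everything else can be consumed in runs, and the DATE rule only needs to be tried at uppercase letters. The words NON-UCASE, NOT-DATE and NEXT-STEP are just names chosen here; the charsets and DATE are repeated so that the snippet stands alone. It avoids BREAK by using [end skip], a rule that never matches, to stop the loop.

```rebol
ucase: charset [#"A" - #"Z"]
lcase: charset [#"a" - #"z"]
digits: charset "1234567890"
non-ucase: complement ucase
date: [copy month [ucase 2 lcase] " " copy day 1 2 digits ", " copy year 4 digits]

mark: next-step: none

; Hypothetical replacement for NOT-RULE DATE.
not-date: [
    any [
        ; a date cannot start here, so eat the whole run at once
        some non-ucase
        ; at an uppercase letter: look ahead with DATE, go back to MARK,
        ; then either stop ([end skip] never matches) or skip this letter
        | mark: [date (next-step: [end skip]) | (next-step: [skip])] :mark next-step
    ]
]
```

It would be used in place of NOT-RULE DATE, e.g. print pattern-replace string not-date date [day "-" month "-" year].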
JN> Again, I think the real question is how much code the REBOL programmer
JN> has to write to get the equivalent transformation.
I don't think that it is too much. Of course, you could use
shorter words etc. in the above code to reduce the keystrokes. :-)
JN> As PARSE is an all-or-nothing affair, I believe that it currently
JN> requires the programmer to work harder to do these kinds of tasks.
Matching a pattern and defining a grammar are two different
things, of course. However, I am convinced that grammars are much
more general and useful than patterns.
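A small illustration of what I mean, using balanced parentheses, which a classic regular expression cannot check but a recursive PARSE rule can. The words OTHER and NESTED are just names picked for this sketch.

```rebol
; any character except parentheses
other: complement charset "()"

; a rule that refers to itself: plain text, or a parenthesized NESTED
nested: [any [some other | "(" nested ")"]]

parse/all "a(b(c)d)e" nested    ; == true  (balanced)
parse/all "a(b(c)d" nested      ; == false (one ")" missing)
```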
IMHO,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r