[REBOL] Re: Processing files / Any AWK users?
From: greggirwin:mindspring at: 19-Nov-2001 11:34
Thanks Joel!
<BIG SNIP>
It would be marginally useful to me as a keystroke-saver, but a
brute-force equivalent in REBOL would (IMHO) only result in
replacing something vaguely like
    awk-like-func: func [one-line [string!] ...] [
        if parse/all/case one-line [ ;...first parse rule...
        ][
            ;...corresponding action...
            exit
        ]
        if parse/all/case one-line [ ;...next parse rule...
        ][
            ;...corresponding action...
            exit
        ]
        ;...
        if parse/all/case one-line [ ;...last parse rule...
        ][
            ;...corresponding action...
            exit
        ]
        ;...error or inaction...
    ]
    foreach myline read/lines myfile [awk-like-func myline]
with
    do %awklib.r
    awk-plan: [
        [ ;...local words for rules/actions]
        [ ;...first parse rule...] [ ;...corresponding action...]
        [ ;...next parse rule... ] [ ;...corresponding action...]
        ;...
        [ ;...last parse rule... ] [ ;...corresponding action...]
    ]
    awk-main myfile awk-plan
which would be handy but not a huge win. I guess it just depends on
whether most of one's parsing/action tasks are line oriented or not.
However, I don't think AWK-PLAN would be too challenging to write.
<END BIG SNIP>
Right. In order to be useful it should provide code savings or help to make
programs clearer and more self-documenting.
The first thing it does is save you writing two foreach loops, one for the
list of files and the other for the lines in each file.
From a processing efficiency standpoint, it parses each line once and then
evaluates each rule against that parsed representation. If you would
otherwise do this manually, that saves you a little more code.
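
For comparison, here is a minimal sketch of the hand-written version: two
nested foreach loops, with each line split into fields once and every rule
then testing that block of fields. The file names, file contents, and the two
rules are made up for illustration; this is not the library's code.

```rebol
REBOL [Title: "Hand-written AWK-style loops (sketch)"]

; Create two small sample files so the sketch is self-contained.
; The file names and contents are assumptions for illustration only.
write %awk-demo-1.txt "ERROR disk full^/ok a b c^/"
write %awk-demo-2.txt "ERROR no route^/"

foreach file [%awk-demo-1.txt %awk-demo-2.txt] [
    foreach line read/lines file [
        ; Split the line into fields once. With NONE as the rule,
        ; PARSE splits on whitespace by default.
        fields: parse line none

        ; Each "rule" below reuses FIELDS instead of re-parsing LINE.
        if all [not empty? fields  fields/1 = "ERROR"] [print line]
        if (length? fields) >= 4 [print fields/4]
    ]
]
```

Both loops and the field split are what a line-oriented helper would take
over, leaving only the rules and actions to write.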
It gives you a few "standard" rules that make it clear what certain actions
are used for:
begin [print "before any processing begins"]
end [print "after all processing is done"]
all [print "do this for every line in each file"]
I've thought about adding these as well:
begin-file [print "before we process the first line in each file"]
end-file [print "after we process the last line in each file"]
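
As a rough illustration of how keyword rules like these could be dispatched,
here is a minimal sketch. The function name run-plan and the plan layout
(keyword followed by an action block) are assumptions for the example, not
the actual implementation, and a block of strings stands in for a file.

```rebol
REBOL [Title: "begin / all / end dispatch (sketch)"]

; Hypothetical runner: PLAN holds keyword/action pairs, LINES is a
; block of strings standing in for the lines of a file.
run-plan: func [plan [block!] lines [block!] /local action] [
    ; SELECT returns the value that follows the keyword, or NONE.
    if action: select plan 'begin [do action]
    foreach line lines [
        if action: select plan 'all [do action]
    ]
    if action: select plan 'end [do action]
]

run-plan [
    begin [print "before any processing begins"]
    all   [print "do this for every line"]
    end   [print "after all processing is done"]
] ["first line" "second line"]
```

The begin action runs once, the all action once per line, and the end action
once at the finish. begin-file and end-file would fit the same pattern, run
just inside an outer loop over files.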
It does a little housekeeping for you and gives you shorthand references to
things like the current record, individual fields in a record, number of
lines read, field separator, record separator, etc.
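
One way such housekeeping could be arranged is sketched below, assuming a
single state object that is updated for every line. The names awk-state,
next-record, record, fields, nr and fs are hypothetical and only loosely
follow AWK's own variables.

```rebol
REBOL [Title: "Per-record housekeeping (sketch)"]

; Hypothetical state object, not taken from the actual library.
awk-state: make object! [
    record: none    ; current line
    fields: none    ; block of field strings for the current line
    nr: 0           ; number of lines read so far
    fs: none        ; field separator; NONE means the default split
]

; Update the state for one line of input.
next-record: func [line [string!]] [
    awk-state/record: line
    awk-state/nr: awk-state/nr + 1
    awk-state/fields: either awk-state/fs [
        parse/all line awk-state/fs    ; split only on the separator
    ][
        parse line none                ; default split
    ]
]

awk-state/fs: ","
foreach line ["alpha,beta gamma" "delta,epsilon"] [
    next-record line
    print [awk-state/nr awk-state/fields/1 length? awk-state/fields]
]
```

Actions can then refer to awk-state/nr or awk-state/fields/1 instead of
tracking counters and splitting lines themselves.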
That's the basic stuff, which works right now. Rules and actions are just
how you showed them (with rules being any singular entity: word!, block!,
paren!) though I hadn't thought about local word definitions.
The biggest omissions right now are probably regular expression support and
automatic conversion of numeric values in fields. I have a dictionary object
that could be plugged in to emulate associative arrays so that's not an
issue.
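
For the numeric conversion, one simple approach is sketched here: try to
convert each field to a decimal and fall back to the original string when
that fails. The helper name field-value is hypothetical, and a real version
would probably want to preserve integers rather than turn everything into
decimals.

```rebol
REBOL [Title: "Numeric field conversion (sketch)"]

; Hypothetical helper: returns a decimal! when FIELD converts cleanly,
; otherwise returns the original string unchanged.
field-value: func [field [string!] /local value] [
    either error? try [value: to-decimal field] [field] [value]
]

fields: ["widget" "3" "2.50"]

; Numeric fields can be used in arithmetic directly.
print (field-value fields/2) * (field-value fields/3)

; Non-numeric fields come back as the original string.
print field-value fields/1
```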
It's kind of handy in its current form (sort of a crude dialect, I guess), but
it would need some work, IMO, to be a good general-purpose tool and provide
value above just using REBOL.
Thanks for the feedback!
--Gregg