[REBOL] 'extract (proposal)
From: shannon:ains:au at: 16-Dec-2000 21:58
I have found a common need that the REBOL 'parse, 'find and 'load
functions don't easily meet: searching a string! or block! for a value
of a particular datatype. I think REBOL needs a new native (or
mezzanine) function, which I would like to call 'extract:
USAGE:
EXTRACT series type /part range /all /deep /tail /last /reverse /index
/custom rule
DESCRIPTION:
Finds a datatype in a series and returns the value(s) in a block.
Otherwise returns an empty block.
EXTRACT is an action value.
ARGUMENTS:
series -- (Type: series block port)
type -- (Type: datatype string block)
REFINEMENTS:
/part -- Limits the search to a given length or position.
range -- (Type: number series port)
/all -- Returns all matches in the series or block
/deep -- Searches within sub-strings and sub-blocks in the source
/last -- Backwards from end of series.
/reverse -- Backwards from the current position.
/index -- Returns a block containing the start and end index of the
match
/custom -- Allows custom datatypes to be matched
rule -- Specifies a rule for the custom datatype
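As a rough illustration of the behaviour described above, here is a minimal mezzanine sketch for the block! case only. It is not the source mentioned later in this post, and it rests on several assumptions: the name extract-type is hypothetical, only the /all and /index refinements are handled, and the type argument is restricted to a datatype!.

```rebol
REBOL [Title: "extract-type sketch (hypothetical helper)"]

; Hypothetical name, chosen so it cannot clash with any built-in EXTRACT.
extract-type: func [
    "Sketch: collect values of one datatype from a block."
    series [block!] "Block to search"
    type [datatype!] "Datatype to match"
    /all "Collect every match, not just the first"
    /index "Collect start and end positions instead of values"
    /local result
][
    result: copy []
    repeat pos length? series [
        if type = type? pick series pos [
            either index [
                ; A single block element starts and ends at the same position.
                append result reduce [pos pos]
            ][
                append/only result pick series pos
            ]
            ; Inside this function ALL is the refinement flag, not the native.
            if not all [break]
        ]
    ]
    result
]

probe extract-type/index ["string" 123.123.123.123 10x10] pair!   ; prints [3 3]
probe extract-type/all [1 "a" 2 "b"] integer!                      ; prints [1 2]
```

Searching a string! by datatype is the harder half of the proposal and is deliberately left out of this sketch.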
examples:
>> extract "I have $10 in the bank!" money!
== [$10]
>> extract/all {<HTML><BODY>Some Text</BODY></HTML>} tag!
== [<HTML> <BODY> </BODY> </HTML>]
>> extract/all "String" char!
== [#"S" #"t" #"r" #"i" #"n" #"g"]
>> indexes: extract/index search-string: {Here is a "string" within a
string} string!
== [11 18]
>> foreach [start stop] indexes [prin search-string/:start prin
search-string/:stop]
>> extract/index ["string" 123.123.123.123 10x10] pair!
== [3 3]
>> digits: charset "0123456789"
>> won-id: ["<WON:" some digits ">"]
>> extract/custom {Killa<fred><123><WON:726372>} won-id
== ["<WON:726372>"]
Advanced example:
>> alpha: charset [#"A" - #"Z" #"a" - #"z"]
>> digits: charset "0123456789"
>> name: [some alpha " " some alpha]
>> phone-number: [3 digits "-" 3 digits "-" 4 digits]
>> extract/custom/all phone-book [name phone-number]
== ["John Aalane" "333-245-2145" "Mary Absenabil" "435-245-5732" .....]
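The /custom examples above can be approximated with 'parse as it stands. The following is a minimal sketch, assuming a string! source and a rule that always consumes at least one character. The name extract-custom is hypothetical, and the function always behaves as if /all had been given.

```rebol
REBOL [Title: "extract-custom sketch (hypothetical helper)"]

; Hypothetical helper: returns every substring of SERIES that matches RULE.
extract-custom: func [
    "Sketch: collect substrings of a string that match a parse rule."
    series [string!] "String to search"
    rule [block!] "Parse rule; must consume at least one character"
    /local result match
][
    result: copy []
    ; parse/all treats whitespace as ordinary characters.
    parse/all series [
        any [
            copy match rule (append result match)   ; keep the matched text
            | skip                                  ; otherwise move on one character
        ]
    ]
    result
]

digits: charset "0123456789"
won-id: ["<WON:" some digits ">"]
probe extract-custom {Killa<fred><123><WON:726372>} won-id   ; prints ["<WON:726372>"]
```

Because the rule is handed to 'parse unchanged, this sketch answers the question about custom rule syntax in the simplest way: the syntax is whatever 'parse accepts.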
I would like to see some courageous REBOL list members attempt to write
source for this beast. I have written some code myself that performs
most of the basic tasks outlined above, but I don't want to contaminate
the fresh thinking of others by posting it now. I will post it soon,
after some discussion on this topic.
Here are some issues for discussion:
Should 'extract return [], false or none when it fails to find a match?
Are the refinements useful, do any clash, should more be added?
Should the functionality of 'extract be split between several
complementary functions to reduce complexity?
Should the syntax for custom rules be the same as for 'parse?
SpliFF
Librarian comment
Later versions of REBOL have a function called extract, but its purpose is different from this proposal. The built-in extract creates a block from an existing block by extracting every nth entry, e.g.: