[append][series]Appending to a series of strings

[1/7] from: joel::neely::fedex::com at: 18-Nov-2003 7:19

Hi Seth,

See below...

Seth wrote:

> >> a: [1 2 3 4 5]
> == [1 2 3 4 5]

<<quoted lines omitted: 4>>

> >> append b/:x "hi"
> == "hi"

At this point, take a look at the value of B to see why:

>> b

== ["hi"]

>> fourth b

** Script Error: Out of range or past end
** Near: fourth b

>> b/:x

== none

> >> x: index? find a 4
> == 4

At this point B has no fourth element (ergo B/:X returns NONE) so...

> >> append b/:x "hi"
> ** Script Error: append expected series argument of type: series port
> ** Near: append b/:x "hi"
>
> In theory, shouldn't append do its thing on b/4 ... What's with the
> error? :\
>

...you can't treat that (non-existent) element as a series. Unlike Perl, which automatically allocates and meaningfully initializes previously non-existent data, REBOL requires that a value exist and be of the correct type for whatever operation you attempt to perform on it.

-jn-
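One way to make the failing call succeed is to create the missing elements before appending. The following is only a minimal sketch, assuming B starts out empty and that empty strings are acceptable placeholders; neither assumption comes from Seth's original (partly omitted) session.

```rebol
a: [1 2 3 4 5]
b: copy []                  ; assumption: B starts out empty
x: index? find a 4          ; position of the value 4 in A

; Pad B with fresh empty strings until it has at least X elements.
; COPY gives each slot its own string instead of one shared literal.
while [x > length? b] [append b copy ""]

append b/:x "hi"            ; b/:x is now a string!, so APPEND has a series to work on
probe b                     ; == ["" "" "" "hi"]
```

The padding loop is the explicit step that Perl would perform implicitly; in REBOL it has to be written out.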

[2/7] from: seth:chromick:earthlink at: 18-Nov-2003 9:53

Joel Neely wrote:

>Hi Seth,
>See below...

<<quoted lines omitted: 38>>

>of the correct type for whatever operation you attempt to perform on it.
>-jn-

Thanks everyone... I was coming from a Perl point of view here -- The REBOL way is a lot more logical -- This is what I get for coding at ungodly hours in the morning ;] Thanks... :D

[3/7] from: joel:neely:fedex at: 18-Nov-2003 11:07

Hi, Seth,

Seth wrote:

> Joel Neely wrote:
>>...you can't treat that (non-existent) element as a series.

<<quoted lines omitted: 5>>

> REBOL way is a lot more logical -- This is what I get for coding at
> ungodly hours in the morning ;] Thanks... :D

IMHO neither more nor less logical, just differently logical.

Suppose one has a collection of small natural numbers (such as test scores ranging from 0 to 100) and one wants to know how many occurrences of each distinct number there are.

Using Perl arrays:

    # assume @scores contains the raw data with dups
    @tallies = ();
    foreach $score (@scores) {
        ++$tallies[$score];
    }
    foreach $score (0..$#tallies) {
        print "$score: $tallies[$score]\n" if $tallies[$score];
    }

Using REBOL blocks:

    ; assume SCORES contains the raw data with dups
    tallies: []
    foreach score scores [
        insert/dup tail tallies 0 score + 1 - length? tallies
        change at tallies score + 1 1 + pick tallies score + 1
    ]
    forall tallies [
        if 0 < tallies/1 [print [-1 + index? tallies ":" tallies/1]]
    ]

or

    ; assume SCORES contains the raw data with dups
    tallies: []
    foreach score scores [
        either found? here: select tallies score [
            here/1: here/1 + 1
        ][
            append tallies reduce [score copy [1]]
        ]
    ]
    foreach [score tally] sort/skip tallies 2 [
        print [score ";" tally/1]
    ]

REBOL is much more "literal"; there are no values that one does not explicitly create (although it is possible to be implicitly explicit at times ;-). On the other hand, it is necessary explicitly to manage details that aren't at the same logical level as the original problem (making sure that there are enough "places" to store the next tally needed, etc).

I'd be interested in any *self-contained* solutions to the above task that might be clearer than the above.

-jn-

--
----------------------------------------------------------------------
Joel Neely            joelDOTneelyATfedexDOTcom           901-263-4446

Enron Accounting in a Nutshell: 1c=$0.01=($0.10)**2=(10c)**2=100c=$1

[4/7] from: antonr:iinet:au at: 20-Nov-2003 16:11

[5/7] from: lmecir:mbox:vol:cz at: 20-Nov-2003 12:14

Hi,

my solution using Parse (I think it is much faster than the other solutions):

    scores: clear []
    loop 30 [append scores random 20]
    group: [p: set i integer! any i q: (print ["score:" i "tallies:" offset? p q])]
    parse probe sort scores [any group]

Anton Rolls wrote:

[6/7] from: brett:codeconscious at: 20-Nov-2003 22:43

Hi Ladislav,

> Hi, my solution using Parse (I think, that it is much faster, than other
> solutions):

<<quoted lines omitted: 3>>

> offset? p q])]
> parse probe sort scores [any group]

That was so lateral I fell over. Just brilliant.

Regards,
Brett.

[7/7] from: joel:neely:fedex at: 20-Nov-2003 13:22

Hi, Ladislav,

Actually, it's not faster (for sufficiently large cases). See below.

Ladislav Mecir wrote:

> Hi, my solution using Parse (I think, that it is much faster, than
> other solutions):
>

I did a bit of benchmarking with functions that use each of the three strategies to generate a block of answers (value/count pairs). I'll include those functions at the end, in case anyone wants to verify that I didn't mangle any code.

Using a SCORES block of random numbers between 0 and 100 (inclusive), the iterative version scales up better than the parse-based version as the size of the SCORES block increases (all times in seconds):

       size   iterative   remove-each   parse-based
       1000      1E-2         0.28          0
      10000      0.351        2.874         0.24
     100000      3.024       36.813         4.637
     200000      4.787         --           4.556
     300000      6.74          --           6.94
     400000      8.832         --          14.371
     500000     11.887         --          18.517
    1000000     21.891         --          39.227

I gave up VERY quickly on the remove-each-based version; it gets eaten alive by memory management overhead.

Since the parse-based version sorts (a copy of) the scores block, its time complexity must be at least O (n log n). The iterative version is only O (n), so it will be faster for sufficiently large n.

I should also point out that the iterative version requires only one value at a time; it can work on an arbitrarily large set of values (e.g., being retrieved across a network connection, read from a huge data file, resulting from computation in a loop, etc.), but the remove- and parse-based versions require the entire set to be available for sorting.

There's one final point, but I'll post it separately.

-jn-

Ladislav Mecir wrote:

> scores: clear []
> loop 30 [append scores random 20]

<<quoted lines omitted: 17>>

>>]
>>

SOURCE CODE FOR TIMED FUNCTIONS IS GIVEN BELOW:

    ;iterative version

    tally-i: func [
        scores [block!]
        /local tallies result
    ][
        tallies: copy []
        foreach score scores [
            either found? here: select tallies score [
                here/1: here/1 + 1
            ][
                insert tail tallies reduce [score copy [1]]
            ]
        ]
        result: make block! length? tallies
        foreach [score tally] sort/skip tallies 2 [
            insert tail result score
            insert tail result tally
        ]
        result
    ]

    ;remove-each-based version

    tally-r: func [
        scores [block!]
        /local tallies
    ][
        tallies: clear []
        foreach uscore sort unique scores [
            append/only tallies reduce [
                uscore
                length? remove-each score copy scores [uscore <> score]
            ]
        ]
    ]

    ;parse-based version

    tally-p: func [
        scores [block!]
        /local group p q result
    ][
        result: copy []
        group: [
            p: set i integer! any i q: (
                insert tail result i
                insert tail result offset? p q
            )
        ]
        parse sort copy scores [any group]
        result
    ]

--
----------------------------------------------------------------------
Joel Neely            joelDOTneelyATfedexDOTcom           901-263-4446

Enron Accounting in a Nutshell: 1c=$0.01=($0.10)**2=(10c)**2=100c=$1
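The thread does not show how the timings were taken. A harness along the following lines could be used to repeat the comparison; it is a sketch only, and TIME-IT and MAKE-SCORES are hypothetical helper names that do not appear in any of the messages. It times a plain SORT as a stand-in, so it runs on its own; any of the three tally functions could be substituted in the timed block.

```rebol
; TIME-IT is a hypothetical helper, not part of the thread.
time-it: func [
    "Evaluate CODE once and return the elapsed time as a time! value"
    code [block!]
    /local start
][
    start: now/precise
    do code
    difference now/precise start
]

; MAKE-SCORES is also hypothetical: SIZE random integers in 0..100.
make-scores: func [size [integer!] /local scores][
    scores: make block! size
    ; RANDOM 101 yields 1..101, so subtract 1 to get 0..100
    loop size [insert tail scores (random 101) - 1]
    scores
]

scores: make-scores 10000
print ["sorting a copy of 10000 scores took" time-it [sort copy scores]]
```

Timing a copy rather than the block itself keeps SCORES unsorted, so repeated runs start from the same input.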

Notes

Quoted lines have been omitted from some messages.