[REBOL] Re: FW: Re: [append][series]Appending to a series of strings
From: joel:neely:fedex at: 20-Nov-2003 13:22
Hi, Ladislav,
Actually, it's not faster (for sufficiently large cases). See below.
Ladislav Mecir wrote:
> Hi, my solution using Parse (I think, that it is much faster, than
> other solutions):
>
I did a bit of benchmarking with functions that use each of the three
strategies to generate a block of answers (value/count pairs). I'll
include those functions at the end, in case anyone wants to verify
that I didn't mangle any code.
Using a SCORES block of random numbers between 0 and 100 (inclusive),
the iterative version scales up better than the parse-based version
as the size of the SCORES block increases (all times in seconds):
size iterative remove-each parse-based
1000 1E-2 0.28 0
10000 0.351 2.874 0.24
100000 3.024 36.813 4.637
200000 4.787 -- 4.556
300000 6.74 -- 6.94
400000 8.832 -- 14.371
500000 11.887 -- 18.517
1000000 21.891 -- 39.227
I gave up VERY quickly on the remove-each-based version; it gets
eaten alive by memory management overhead.
Since the parse-based version sorts (a copy of) the scores block,
its time complexity must be at least O (n log n).
The iterative version is only O (n), so it will be faster for
sufficiently large n.
I should also point out that the iterative version requires only
one value at a time; it can work on an arbitrarily large set of
values (e.g., being retrieved across a network connection, read
from a huge data file, resulting from computation in a loop, etc.),
but the remove- and parse-based versions require the entire set
to be available for sorting.
There's one final point, but I'll post it separately.
-jn-
Ladislav Mecir wrote:
> scores: clear []
> loop 30 [append scores random 20]
> group: [p: set i integer! any i q: (print ["score:" i "tallies:"
> offset? p q])]
> parse probe sort scores [any group]
>
>>-----Original Message-----
>>From: Anton Rolls [mailto:[antonr--iinet--net--au]]
>>
>>; initialize some random scores
>>scores: clear []
>>loop 30 [append scores random 20]
>>
>>; figure out how many of each score
>>tallies: clear []
>>foreach uscore sort unique scores [
>> append/only tallies reduce [
>> uscore
>> length? remove-each score copy scores [uscore <> score]
>> ]
>>]
>>
SOURCE CODE FOR TIMED FUNCTIONS IS GIVEN BELOW:
;iterative version
tally-i: func [
scores [block!]
/local tallies result
][
tallies: copy []
foreach score scores [
either found? here: select tallies score [
here/1: here/1 + 1
][
insert tail tallies reduce [score copy [1]]
]
]
result: make block! length? tallies
foreach [score tally] sort/skip tallies 2 [
insert tail result score
insert tail result tally
]
result
]
;remove-each-based version
tally-r: func [
scores [block!]
/local tallies
][
tallies: clear []
foreach uscore sort unique scores [
append/only tallies reduce [
uscore
length? remove-each score copy scores [uscore <> score]
]
]
]
;parse-based version
tally-p: func [
scores [block!]
/local group p q result
][
result: copy []
group: [
p: set i integer! any i q: (
insert tail result i
insert tail result offset? p q
)
]
parse sort copy scores [any group]
result
]
--
----------------------------------------------------------------------
Joel Neely joelDOTneelyATfedexDOTcom 901-263-4446
Enron Accountingg in a Nutshell: 1c=$0.01=($0.10)**2=(10c)**2=100c=$1