[REBOL] Re: Sort by first part of line
From: rotenca:telvia:it at: 8-Sep-2002 2:15
Hi Sunanda,
> For this and other reasons, I have some doubts about Rebol's ability to run
> 24x7 while pounding loads of data. Any really big designs out there should
> look closely at this issue.
I am not sure. For example, at least under 1.2.5.3.1, my two tests stop
changing their memory allocation after some oscillations (and without any
recycle command).
Your test is rock-stable in memory allocation without recycle. I did not
try other tests.
Often (I think always) a continuous memory increase is only the result of a
bug in user code (blocks and strings are the first things to check).
Finally, in my tests I did not see any variation in the timing of your test,
and so far we are not sure that memory allocation continues to grow when we
repeat all the tests.
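To show the kind of user-code bug I mean, here is a minimal sketch (the
function names are invented for this example and are not taken from any of the
tested scripts). A literal block or string inside a function body is created
once and reused on every call, so appending to it without copy makes it grow
for as long as the program runs:
-----------code-----------
; hypothetical example: the literal [] is created once and reused,
; so it keeps every value from the earlier calls
leaky: func [x] [
    acc: []
    append acc x
    length? acc
]
leaky 1 ; acc holds 1 item
leaky 2 ; acc holds 2 items: it grows on every call

; copy gives a fresh block on each call
clean: func [x] [
    acc: copy []
    append acc x
    length? acc
]
clean 1
clean 2 ; acc holds 1 item again
-----end of code ----
The same applies to a literal "" string, which is why copy "" is used in the
code further down.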
> Thanks for your timings. I get (on my machine) a run time of about 6.5
> seconds. 25% faster than Gregg's but still some way to go to beat Scott and
> Joel. It's that parse again. You only do one compared to Gregg's 750,000 (or
> so). I guess it's not the number of invocations: it's the total data that has
> to be parsed that counts.
I suspect (I'm sure :-) you are using a non-beta Rebol release. I get similar
results under 1.2.3.1.3.
Automatic memory allocation in Rebol up to 1.2.3 is buggy; to limit the
problems, blocks need to be preallocated with make block!.
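As a rough sketch of the difference (the word names are hypothetical, the
size 80000 is simply the one used in my code, and I assume stats is an
acceptable way to read the total memory allocated):
-----------code-----------
; sketch only: a block grown on demand versus a preallocated one
n: 80000

grown: copy []           ; starts small, must be enlarged as items arrive
loop n [insert tail grown "x"]

prealloc: make block! n  ; room for n values is reserved in one step
loop n [insert tail prealloc "x"]

print ["memory allocated:" stats]
-----end of code ----
Both blocks end up with the same contents; only the way the space is obtained
differs.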
Here is another round of tests under 1.2.5.3.1, this time including your code,
which I forgot last time:
1.2.5.3.1 Celeron 333 RAM 128 Mb Windows 98 first edition
reading data : 0:00:00.22  memory allocated: 6465424
Sunanda      : 0:00:45.92  memory allocated: 9435920
Scott        : 0:00:02.8   memory allocated: 13418112
Carl         : 0:00:02.09  memory allocated: 13586848
Romano       : 0:00:01.37  memory allocated: 15895152
Romano-int   : 0:00:01.43  memory allocated: 18477920
Joel         : 0:00:01.54  memory allocated: 23200928
got same results
done
And this is a test under 1.2.1.3.1 with my code optimized for this version
(code follows):
1.2.1.3.1 Celeron 333 RAM 128 Mb Windows 98 first edition
reading data : 0:00:00.22  memory allocated: 6071976
Sunanda      : 0:00:59.1   memory allocated: 8868200
Scott        : 0:00:11.26  memory allocated: 12222608
Carl         : 0:00:03.84  memory allocated: 12563472
Romano       : 0:00:01.37  memory allocated: 14783712
Romano-int   : 0:00:01.38  memory allocated: 17348192
Joel         : 0:00:03.85  memory allocated: 18710144
got same results
done
Here I am faster than Joel, but Joel's code can be optimized too.
-----------code-----------
;; Romano -- parse sort for 1.2.1.3.1
;; -------------------
data: read %../public/www.pusatberita.com/test.txt
report-item/start "Romano"
sorted-data: make block! 80000 ;changed for 1.2.1.3.1
parse/all data [
    some [
        copy num [any " " to " "]
        copy rest thru newline
        (insert insert tail sorted-data num rest)
    ]
]
sort/skip sorted-data 2
report-item/end "Romano"
write %sorted-data-romano.txt sorted-data
unset 'sorted-data
recycle

;; Romano -- parse sort int for 1.2.1.3.1
;; -------------------
data: read %../public/www.pusatberita.com/test.txt
report-item/start "Romano-int"
blk: make block! 80000 ;changed for 1.2.1.3.1
parse/all data [
    some [
        h: any " " copy num integer!
        :h copy rest thru newline
        (insert insert tail blk to integer! num rest)
    ]
]
sort/skip blk 2
report-item/end "Romano-int"
sorted-data: copy ""
foreach [x x2] blk [insert tail sorted-data x2]
write %sorted-data-romano-int.txt sorted-data
unset 'sorted-data
recycle
; for next tests:
data: read/lines %../public/www.pusatberita.com/test.txt
-----end of code ----
---
Ciao
Romano