[REBOL] Re: Sort by first part of line
From: carl:cybercraft at: 7-Sep-2002 11:16
On 07-Sep-02, Gregg Irwin wrote:
> Hi Scott,
> Good deal! I used your data generator and then did this:
> data: read/lines %data-1.txt
> compare-items: func [a b /local aa bb] [
>     aa: to integer! first parse a none
>     bb: to integer! first parse b none
>     either aa = bb [0][either aa < bb [-1][1]]
> ]
> t: now/time/precise
> sort/compare data :compare-items
> print now/time/precise - t
> halt
> which gave me these results for 3 runs (on a P900 w/384 Meg of RAM):
> 1. 0:00:20.099
> 2. 0:00:20.269
> 3. 0:00:20.099
> Changing the comparisons to use Sunanda's approach (assuming
> equality is least likely):
>     either aa < bb [-1][either aa > bb [1][0]]
> 1. 0:00:19.508
> 2. 0:00:19.528
> 3. 0:00:19.518
> Making the data a hash! didn't speed it up and trying to make it a
> list! didn't work. Even just a plain SORT on the list didn't work,
> and inserted 'end for some items. Haven't investigated.
You don't have to look far...
>> x: to-list [1 2 3]
== make list! [1 2 3]
>> sort x
== make list! [1 2 end]
>> head x
== make list! [1 2 end]
!
That's on the non-beta View. Do the beta REBOLs give the same
results?
Anyway, using Scott's data-file, here's my leaving-file-on-the-disk
approach. I expect it's slower than loading the whole file into
memory, but might be better for super-huge files...
REBOL []
data1: %data1.txt ; Set paths to where you want.
data2: %data2.txt
; Scott's data creation code...
if not exists? data1 [
    loop 29688 [
        write/append/lines data1 rejoin [
            " "
            random/only {1234567890}
            random/only {1234567890}
            random/only {1234567890}
            " "
            random {eawk acef 32233}
        ]
    ]
]
if exists? data2 [delete data2] ; *** Deletes previous runs!
; Sorting...
t: now/time/precise
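file-index: copy []
file: open/lines data1
; Index pass: store each line's first 4 characters (the sort key)
; together with its line number.
forall file [
    append file-index reduce [copy/part file/1 4 index? file]
]
close file
; Sort the key/line-number pairs by key.
sort/skip file-index 2
; Output pass: copy the lines to data2 in sorted order.
file: open/lines data1
foreach [code line] file-index [
    write/append/lines data2 file/:line
]
close file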
print now/time/precise - t
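
To see what the index looks like without touching the disk, here is a small sketch of the same idea run on an in-memory block. The block LINES and its three sample strings are made up for illustration; only the key/line-number layout and the SORT/SKIP call match the script above.

```rebol
REBOL []

; Hypothetical stand-in for the lines of the data file.
lines: [
    " 203 eawk acef 32233"
    " 017 kwae face 33322"
    " 950 acef eawk 23323"
]

; Build [key line-number key line-number ...] as in the script above.
file-index: copy []
forall lines [
    append file-index reduce [copy/part lines/1 4 index? lines]
]
; Older REBOL 2 releases leave the series at its tail after FORALL.
; Resetting to the head is harmless where that is not the case.
lines: head lines

; /skip 2 treats the block as two-value records ordered by the first value.
sort/skip file-index 2

foreach [code line] file-index [
    print [mold code line]
]
```

This should list the keys in the order 017, 203, 950 with line numbers 2, 1 and 3. Note that the keys are compared as strings, not integers. That gives the same order as a numeric sort here only because the generated codes all have the same width.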
--
Carl Read