[REBOL] unbuffered file reads (large files)
From: tim-johnsons::web::com at: 23-Oct-2007 10:46
Hello:

I'm processing some large text files, 100,000 lines or more. It would seem to me that
using 'open with the 'direct refinement would be the answer, but I'm seeing 1) buffering
and 2) problems terminating the read loop.

What follows is a test rebol file and a little text file to test with. I've made this
to run as CGI, so that the port dump is a little more readable. I have further comments
at the end.

;; rebol file - can run from command line or as CGI
#!/usr/bin/rebol -cs
REBOL[]
print "Content-Type: text/html^/"
print <pre>
print "Read file with cache"
inf: open/lines %test.txt
while[not tail? inf][
    print first inf
    inf: next inf
    ]
close inf ;; works fine, but is buffered
print "Read file without cache"
inf: open/direct/lines %test.txt ;; help open says 'direct should be unbuffered
;; tail test fails immediately
;while[not tail? inf][
;    print first inf
;    inf: next inf
;    ]
;; use truth test of inf
while[inf][
    ?? inf ;; look at the 'state members, especially 'inBuffer
    print first inf
    inf: next inf
    ]
close inf

;; here is the input text file
line one
line two
line three
line four
line five

;; comments follow:
1) The tail test fails in direct mode.
2) The truth test for 'inf is not helpful either.
3) It looks to me like direct mode *is* buffered after the first read.
4) The termination test could be something like
   if all[string? inf/state/inBuffer empty? inf/state/inBuffer][break]
5) But we still have buffered input, right?

What do you all think?
Thanks
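
For comparison, here is a minimal sketch of another way the direct-mode loop might be terminated without looking inside the port's state object. It rests on an assumption that is not confirmed anywhere in this post: that in REBOL 2, copy/part on a port opened with /direct/lines returns a block of up to N lines and returns none once the end of the file is reached. The extra empty? check is a guard in case an empty block comes back instead. The word chunk-size and its value are hypothetical, chosen only for illustration.

```rebol
REBOL []

;; Sketch only, under the assumption stated above (REBOL 2 behaviour):
;; copy/part on a direct port returns none at end of file.
chunk-size: 1000  ;; hypothetical number of lines to pull per read

inf: open/direct/lines %test.txt

;; Stop when copy/part yields none, or (defensively) an empty block.
while [all [lines: copy/part inf chunk-size  not empty? lines]] [
    foreach line lines [print line]
]

close inf
```

If the assumption holds, only chunk-size lines at a time are held in the script's own variables, although that by itself says nothing about how much the port buffers internally.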