Mailing List Archive: 49091 messages

unbuffered file reads (large files)

 [1/3] from: tim-johnsons::web::com at: 23-Oct-2007 10:46


Hello:

I'm processing some large text files - 100,000 lines or more. It would seem to me that using 'open with the 'direct refinement would be the answer, but I'm seeing
1) buffering
2) problems terminating the read loop.

What follows is a test rebol file and a little text file to test. I've made this to run as CGI, so that the port dump is a little more readable. I have further comments at the end.

    ;; rebol file - can run from command line or as CGI
    #!/usr/bin/rebol -cs
    REBOL[]
    print "Content-Type: text/html^/"
    print <pre>
    print "Read file with cache"
    inf: open/lines %test.txt
    while[not tail? inf][
        print first inf
        inf: next inf
        ]
    close inf
    ;; works fine, but is buffered
    print "Read file without cache"
    inf: open/direct/lines %test.txt  ;; help open says 'direct should be unbuffered
    ;; tail test fails immediately
    ;while[not tail? inf][
    ;    print first inf
    ;    inf: next inf
    ;    ]
    ;; use truth test of inf
    while[inf][
        ?? inf  ;; look at the 'state members, especially 'inBuffer
        print first inf
        inf: next inf
        ]
    close inf

    ;; here is the input text file
    line one
    line two
    line three
    line four
    line five

;; comments follow:
1) The tail test fails in direct mode.
2) The truth test for 'inf is not helpful either.
3) It looks to me like direct mode *is* buffered after the first read.
4) The termination test could be something like
       if all[string? inf/states/inBuffer empty? inf/states/inbuffer][break]
5) But we still have buffered input, right?

What do you all think?
Thanks

 [2/3] from: sqlab:gmx at: 23-Oct-2007 21:29


If you open bigger files, you will see that not the whole file is read ahead.

The termination can look like this:

    inf: open/direct/lines %file
    while [line: pick inf 1] [probe line]

-------- Original Message --------
> Date: Tue, 23 Oct 2007 10:46:14 -0800
> From: Tim Johnson <tim-johnsons-web.com>
<<quoted lines omitted: 55>>

 [3/3] from: tim-johnsons::web::com at: 23-Oct-2007 11:55


On Tuesday 23 October 2007, Anton Reisacher wrote:
> If you open bigger files, you will see, that not the whole file is read ahead.
Aha!
> The termination can look like this
>
>     inf: open/direct/lines %file
>     while [line: pick inf 1] [probe line]
Understood. Thanks, Anton. That's a big help.
Tim
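
For reference, the loop suggested in message 2 can be wrapped into a small routine. This is only a sketch, assuming REBOL 2 port behavior as described in the thread (that 'pick on a port opened with /direct/lines returns none once the input is exhausted). The name count-lines is hypothetical and does not come from the thread.

```rebol
REBOL []

;; Hypothetical helper (not from the thread): count the lines of a file
;; without loading the whole file into memory.
count-lines: func [file [file!] /local port line n] [
    n: 0
    port: open/direct/lines file
    ;; Per message 2, 'pick returns none at end of input, which ends the loop.
    ;; An empty line comes back as an empty string, which still counts as
    ;; true in REBOL, so blank lines do not stop the loop early.
    while [line: pick port 1] [
        n: n + 1
    ]
    close port
    n
]
```

It would be called as count-lines %test.txt. The loop never tests tail? and never reassigns the port with 'next, which are the two things that caused trouble in the original test script.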

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted