Mailing List Archive: 49091 messages

[REBOL] Re: Large Files and Binary Read

From: gscottjones:mchsi at: 20-Oct-2002 17:37

From: "James Marsden"
> Yeah now I found that bug.. grrrr.. it lets me
> get the first block of data then repeats endlessly.
>
> Anyone suggest a fix?
...

Hi, James,

I hoped you wouldn't be back, which would be good news, but I suspected that you would be back. :(

There is no direct working substitute that I am aware of for a true seek (skip in REBOLese). When you need to skip through data while using /direct/binary in combination on a local file, the only thing that I am aware of is to open the file, then "waste" parts of the file as a way to simulate skipping. Given that it is in direct mode, memory is not being eaten up by an ever-expanding buffer. However, you are, in essence, cycling through *all* the data, which may be substantial for the file sizes to which you have referred.

Also, there is a good chance that the first block of info you got, which you thought was correct, was in fact an incorrect block (it was probably the beginning of the file, even though you used read/direct/binary/skip).

So that we "know" what we are dealing with, I created a very small file with repeating data by column. Here is a ten-row matrix with one hex digit in each of sixteen columns:

    blk: copy []
    loop 10 [repeat n 16 [append blk skip to-hex n - 1 7]]
    write %//windows/desktop/test.txt rejoin blk

Now, when you practice with your actual algorithm, you'll be able to see whether you in fact have the correct columns.

Now for one of many, many variations to show how to pseudo-skip through your data:

    rows: 10
    cols: 16
    data-length: 4
    start-col: 3
    data-slice: copy ""
    data: open/direct/binary %//windows/desktop/test.txt
    repeat r rows [
        ;skip to proper column
        copy/part data start-col - 1
        ;collect some data
        append data-slice to-string copy/part data data-length
        ;skip to end of row
        copy/part data cols - start-col - data-length + 1
    ]
    close data
    probe data-slice

The most pertinent part is the "throw-away" copy/part statements. The rest was just my arbitrary controls to cycle by rows (hey, it was a quick and dirty hack! :-).
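
One caveat with the throw-away copy/part statements above: each call still builds a series as large as the gap being skipped, so a very wide skip briefly costs that much memory. A minimal sketch of one way around that is to waste the bytes in fixed-size pieces. The name `discard` and the 4096-byte chunk size are my own illustrative choices, not anything built into REBOL, and the sketch assumes that copy/part on a /direct port returns none (or an empty series) at end of file.

```rebol
; Hypothetical helper (not a REBOL built-in): read and throw away
; up to n bytes from a port opened with /direct.
discard: func [
    "Read and discard up to n bytes from a /direct port."
    port [port!]
    n [integer!]
    /local chunk
][
    while [n > 0] [
        ; never ask for more than 4096 bytes at once (arbitrary size)
        chunk: copy/part port min n 4096
        ; assumption: none or an empty series means end of file
        if any [none? chunk empty? chunk] [break]
        n: n - length? chunk
    ]
]
```

With this in place, the first throw-away line in the loop could be written as `discard data start-col - 1`. For the tiny test file it makes no practical difference; it would only matter when the gap between wanted columns is large.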
For huge row counts but with nominal column counts, I suspect you will actually want to read in a buffered row of data at a time, and then parse the proper column stuff out. This would help to reduce disk access while protecting memory. If the column and row counts are both huge, then I suspect grabbing a sector of disk data at a time would be more efficient, but it would take more work to control the column access algorithm.

Hope this makes some sense. Out of time. Good luck.

--Scott Jones
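
The row-at-a-time idea can be sketched roughly as follows, reusing the test file and the same variable names as the earlier example. This is an untested outline rather than a finished solution. It assumes every row is exactly `cols` bytes with no line terminators (as in the test file), and that copy/part on a /direct port returns none at end of file, which is what ends the loop.

```rebol
cols: 16
data-length: 4
start-col: 3
data-slice: copy ""
data: open/direct/binary %//windows/desktop/test.txt
; one read per row instead of three
; assumption: copy/part returns none at end of file
while [row: copy/part data cols] [
    ; stop on a short final row rather than slice past its end
    if cols > length? row [break]
    ; slice the wanted column out of the in-memory row
    append data-slice to-string copy/part skip row start-col - 1 data-length
]
close data
probe data-slice
```

The trade-off is that a whole row sits in memory for each pass, which is fine for nominal column counts but is exactly what you would want to avoid if the rows themselves are huge.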