[REBOL] Re: Deleting lines from a big file
From: jelinem1:nationwide at: 17-Jul-2001 13:15
I thought it might be interesting to give another solution that I've seen
implemented in practice.
We had many flat data files, some very large, others not, holding
variable-length records. The files were sorted nightly; IIRC sorting aided
performance, but wasn't required. They were treated as relational data,
and the records were subject to the usual CRUD operations. To avoid the
performance hit of re-copying a file every time a record was updated
or deleted, this is what was done: the line to delete was located in the
file and overwritten with blanks. Added records were simply appended to
the end of the file, and an update was a delete followed by an add. The
blank lines would float to the top of the file during the nightly sort,
and would be removed at that time.
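As a rough illustration of the blank-out approach described above, here is a
minimal sketch in Python (not REBOL, and not the original system's code). The
function names, the byte-string key match, and the assumption that every
record is a newline-terminated line are all made up for the example.

```python
def blank_out_record(path, key):
    """Overwrite the first line containing `key` (bytes) with spaces, in place.

    The file size does not change, so nothing after the record has to move.
    Returns True if a line was blanked, False if no line matched.
    """
    with open(path, "r+b") as f:
        while True:
            start = f.tell()              # byte offset where this line begins
            line = f.readline()
            if not line:                  # end of file, no match
                return False
            body = line.rstrip(b"\r\n")   # leave the line terminator alone
            if key in body:
                f.seek(start)
                f.write(b" " * len(body))
                return True


def append_record(path, record):
    """Add a record (bytes, without newline) at the end of the file."""
    with open(path, "ab") as f:
        f.write(record + b"\n")


def update_record(path, key, new_record):
    """Update = blank out the old record, then append the new one."""
    if not blank_out_record(path, key):
        return False
    append_record(path, new_record)
    return True
```

Only the bytes of the matched line are rewritten, which is why the "delete"
is cheap even on a very large file; the blank lines remain until some later
pass (the nightly sort, in the system described) drops them.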
I didn't write the system so I don't know the mechanics of how the lines
were removed once they had "floated to the top". Another file copy?
The point was that the performance hit of copying the file would occur
only once every 24 hours, at night during off-hours, while the data was
still "deleted" immediately.
- Michael Jelinek
From: Holger Kruse <[holger--rebol--com]>@rebol.com on 07/17/2001 11:51 AM
On Mon, Jul 16, 2001 at 03:23:40PM -0700, Ken Vincent wrote:
> Greetings,
>
> I'm looking for a way to delete problematic lines from very large data
> files (too large for buffered access).
> row-to-delete: 25
> condition: "00000027"
>
> big-file: open/direct/lines %VeryBigDataFile.DAT
> skip big-file (row-to-delete - 1)
> line-to-check: first big-file ; confirm that this is the correct line to delete
> if (find line-to-check condition) <> none [remove line] <------------- Is there some way to do this???
No, because operating systems do not allow this. File data is typically stored
consecutively, so you cannot delete something in the middle of a file. The best
solution is to copy the file to another file, line by line, in /direct mode,
skipping the line you want to delete. Then delete the old file and rename the
new one.
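
The copy-and-skip procedure above might look roughly like this. It is only a
sketch, written in Python rather than REBOL; the function name, the ".tmp"
suffix, and the parameter names (borrowed from the question) are assumptions
for illustration. It streams one line at a time, so the file never has to fit
in memory.

```python
import os


def delete_line_by_copy(path, row_to_delete, condition):
    """Copy `path` line by line to a temporary file, skipping one line.

    The line is skipped only if it is row number `row_to_delete` (1-based)
    and contains `condition` (bytes). Returns True if a line was dropped.
    """
    tmp_path = path + ".tmp"      # hypothetical temp name next to the original
    deleted = False
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        for row, line in enumerate(src, start=1):
            if row == row_to_delete and condition in line:
                deleted = True    # confirmed: this is the line to drop
                continue
            dst.write(line)
    if deleted:
        # os.replace removes the old file and renames the new one in one step
        os.replace(tmp_path, path)
    else:
        os.remove(tmp_path)       # nothing matched; keep the original file
    return deleted
```

For the example in the question, the call would be something like
`delete_line_by_copy("VeryBigDataFile.DAT", 25, b"00000027")`.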
--
Holger Kruse
[holger--rebol--com]