[REBOL] Re: Compression
From: joel:neely:fedex at: 27-Jun-2001 2:04
Hi, Louis,
Dr. Louis A. Turk wrote:
> It might be the second possibility. If I type the following
> at the command prompt it works:
>
> write/binary %data.r decompress read/binary %/a/data.r
>
> If compress/decompress is used for every write, then
> performance will be affected. I just want to compress the
> data so that more will fit on a floppy disk for backup
> purposes. I don't want the database compressed during
> ordinary usage. How do I get around this problem?
>
I can do all of
>> write/binary %compd.rz compress read %compd.r
>> do decompress read/binary %compd.rz
>> timeall/run 10 10 ;a function defined in compd.r
;... normal output occurs
>> write/binary %foo.r decompress read/binary %compd.rz
without incident, after which diff shows compd.r and foo.r
to be identical.
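Since you only want compression for the floppy backup, here is a
minimal sketch that wraps those same calls in two helpers, so the
working file stays uncompressed during ordinary use. The names
backup-to and restore-from are my own invention, not built-ins, and
I'm assuming the file names from your message.

```rebol
REBOL [Title: "Backup helpers (sketch)"]

; backup-to and restore-from are hypothetical helper names.

; Compress the working file and write the result to the backup location.
backup-to: func [source [file!] target [file!]] [
    write/binary target compress read source
]

; Decompress a backup and write it out as the working file.
restore-from: func [source [file!] target [file!]] [
    write/binary target decompress read/binary source
]

backup-to %data.r %/a/data.r        ; working file -> floppy
restore-from %/a/data.r %data.r     ; floppy -> working file
```

Only the backup and restore steps pay the compression cost; every
ordinary read and write of %data.r is untouched.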
Given a data file, as below
$ cat whatever.data
1234 "Ferd Burfel" 127.0.0.1 #901-555-1212
2345 "Joe Doaks" 127.255.255.255 #615.555.1212
3456 "Patrick Henry" 255.255.255.0 #206.555.1212
$
I can do all of the following without problems
>> write/binary %whatever.rz compress read %whatever.data
>> foo: load decompress read/binary %whatever.rz
== [1234 "Ferd Burfel" 127.0.0.1 #901-555-1212
2345 "Joe Doaks" 127.255.255.255 #615.555.1212
3456 "Patrick Henry" 255.25...
>> print foo
1234 Ferd Burfel 127.0.0.1 901-555-1212 2345 Joe Doaks
127.255.255.255 615.555.1212 3456 Patrick Henry 255.255.255.0
206.555.1212
Therefore, your core strategy could be
1) decompress the data after reading from file
2) modify the memory data as needed
3) compress the data and write back to file
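As a rough sketch only, the three steps might look like this for
the %whatever.rz file created above. The added record is made-up
sample data, and I'm assuming the whole data set fits in memory.

```rebol
REBOL [Title: "Load, modify, save cycle (sketch)"]

db-file: %whatever.rz    ; compressed file from the example above

; 1) decompress the data after reading from file
records: load decompress read/binary db-file

; 2) modify the memory data as needed (hypothetical sample record)
append records [4567 "Jane Roe" 10.0.0.1 #555-0100]

; 3) compress the data and write back to file
;    mold turns the block back into loadable source text
write/binary db-file compress mold records
```

Note that mold writes the block with its enclosing brackets; load
still hands back the same flat block of values on the next read.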
Of course, this introduces lots of failure modes:
a) an error could occur during (2)
b) the system could hang/crash during (2)
c) a problem during (3) could smash the data beyond repair
...etc...
Problems (a) and (b) could occur even if you were using an
uncompressed file. The consequence is loss of modifications
since last write.
Problem (c) could also occur, but you'd have more of a chance
of recovering some data by hand if the file were uncompressed.
The worst-case consequence is loss of all data.
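One way to soften problem (c), sketched here under the assumption
that there is room on disk for two copies, is to keep the previous
file around before overwriting it. The name safe-save and the
".bak" suffix are hypothetical choices of mine.

```rebol
REBOL [Title: "Keep previous copy before saving (sketch)"]

; safe-save is a hypothetical helper name.
safe-save: func [file [file!] data [block!] /local bak] [
    bak: join file ".bak"            ; e.g. %whatever.rz.bak
    ; Copy the last good file aside before touching it.
    if exists? file [write/binary bak read/binary file]
    ; Now overwrite the main file with the new compressed image.
    write/binary file compress mold data
]
```

If the final write is interrupted, the .bak file should still hold
the previous image. This narrows the window for total loss; it does
not close it.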
Standard safety techniques include
i) backing up between sessions
ii) saving the data (whether compressed or not) between
modifications within a session
iii) writing a "journal file" entry for each modification,
within a session, combined with doing (i)
Option (i) is coarse-grained safety, with lowest cost. Option
(ii) is most costly in time, and even more so if compressing
with each save-to-file operation. Option (iii) is often a
reasonable compromise, but does require that you have a
recovery script that is able to take a stable image file and
replay the journal entries (add/change/delete) to bring the
data back to the last checkpointed status.
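To make option (iii) a little more concrete, here is a minimal
sketch. Everything in it is an assumption for illustration: the
journal file name, the entry layout ([add [...]] or [delete id]),
the helper names, and the four-values-per-record layout taken from
the sample data above. Change entries are left out for brevity.

```rebol
REBOL [Title: "Journal file (sketch)"]

journal: %whatever.journal    ; hypothetical file name

; Append one molded entry per line to the journal.
log-change: func [entry [block!]] [
    write/append journal append mold entry newline
]

; Re-apply every journal entry to a block of records loaded
; from the last stable image (4 values per record).
replay: func [records [block!] /local pos] [
    if exists? journal [
        foreach entry load/all journal [
            switch first entry [
                add [append records second entry]
                delete [
                    ; /skip 4 matches only on the first field of a record
                    if pos: find/skip records second entry 4 [
                        remove/part pos 4
                    ]
                ]
            ]
        ]
    ]
    records
]

log-change [add [4567 "Jane Roe" 10.0.0.1 #555-0100]]
log-change [delete 2345]
```

After a crash, recovery would be along the lines of
replay load decompress read/binary %whatever.rz, followed by a
fresh save of the image and deletion of the journal to mark the
new checkpoint.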
YMMV, but I'd say that reinventing all of the functionality
and security of a real database engine from scratch is a
very non-trivial task. Good luck!
-jn-
------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com