Compression

[1/36] from: ptretter:charter at: 17-Apr-2001 11:14

Coming soon! Imagine putting more than 3600 CDs' worth of data on a floppy disk. Or take a gigabyte of data and compress it to under 400 bytes. I purchased /View/Pro and will most likely purchase runtime licenses once they are available for /View, and begin distribution of NEW compression software depending on the licensing terms available. Imagine what 400 bytes of compressed data (representing a gig or more) could mean to handheld devices like portable mp3 players or digital cameras: the capability to store your entire mp3 collection in a portable device. I know you're interested, but you will have to wait a bit longer. Thanks to the guys in the IRC REBOL channel on EFNET for your help and cooperation (you know who you are). Paul Tretter

[2/36] from: depotcity:telus at: 17-Apr-2001 9:38

I find this a bit hard to swallow. 2,340 Gigabytes on one floppy? Pull that one off SUCCESSFULLY and I'll see you get the Nobel Prize. T Brownell

[3/36] from: mat:eurogamer at: 17-Apr-2001 17:54

Heya Paul,

PT> Imagine with 400 bytes of compressed data (representing a gig or
PT> more) what this could mean to handheld devices like portable mp3
PT> players or digital cameras. The capability to store your entire mp3
PT> collection in a portable device.

Is it especially good crack where you come from?

-- Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee http://www.eurogamer.net | http://www.eurogamer-network.com

[4/36] from: ptretter:norcom2000 at: 17-Apr-2001 12:04

Actually, I expected that kind of remark. :) I also expect more of the same, and I looked in the mirror several times asking similar questions. Paul Tretter

[5/36] from: ptretter:charter at: 17-Apr-2001 12:01

Wow, that sounds like a big number. Actually, though, it's more like 65 terabytes given some of REBOL's limitations with number size, but I feel that's acceptable. :) Can I hold you to that Nobel Peace Prize? ;) Actually, economics and time play the most important part in the amount of compression needed. For example, it would take a very long time to uncompress 2 gigs' worth of data from 400 bytes. That may not be the best means of data handling, depending on the situation. However, I could, for example, put an entire CD image on a website (compressed to less than 400 bytes), then download it over a dialup line and uncompress it very easily, with very little possible risk of corruption due to the small footprint. I would be skeptical of this too, and I had doubts and hit many roadblocks. However, it can and will be available soon. Thanks for the comments. Paul Tretter

[6/36] from: ryanc:iesco-dms at: 17-Apr-2001 10:17

Paul, I am sure you can understand that most people have a hard time believing you, as you may have guessed. If I did not know you, I would be laughing at you. Bypassing the argument of whether or not you are correct, how far have you gotten? What sort of time does it take to compress something? Generally speaking, how does it work? BTW, make sure to fill out a patent disclosure immediately! --Ryan

-- Ryan Cole Programmer Analyst www.iesco-dms.com 707-468-5400 I am enough of an artist to draw freely upon my imagination. Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world. -Einstein

[7/36] from: ryan:christiansen:intellisol at: 17-Apr-2001 12:26

Well, best of luck! At least make C|Net aware of your compression utility and give REBOL lots of credit. :)

Paul wrote...

> However, I for example put an entire CD image on a website (compress to less than 400 bytes) then download it over a dialup line and uncompress it very easily with very little possible risk of corruption due to the small footprint. I would be skeptical also of this and had doubts and many road blocks. However, it can and will be available soon. Thanks for the comments.

Ryan C. Christiansen Web Developer Intellisol International 4733 Amber Valley Parkway Fargo, ND 58104 701-235-3390 ext. 6671 FAX: 701-235-9940 http://www.intellisol.com Global Leader in People Performance Software

[8/36] from: mat:eurogamer at: 17-Apr-2001 18:29

Heya Paul,

PT> Actually I expected that kind of remark. :) I also expect more of the same
PT> and looked in the mirror several times asking similar questions.

Don't worry, the effects wear off after a while.

-- Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee http://www.eurogamer.net | http://www.eurogamer-network.com

[9/36] from: ptretter:norcom2000 at: 17-Apr-2001 12:27

I can understand that it's hard to believe, but I struggled with this a long time. I couldn't sleep at times. I tried talking to math professionals looking for an answer but couldn't get one. Slowly, I started to understand what couldn't be done and scrapped it immediately. I also struggled with number size and had to find alternatives. Finally, I found the solution, which is very unique compared to current compression methods. That was the hardest part. Now the next pieces will be easy: to create the software to perform the calculations. Once I get that piece in place, I can move much more aggressively with a prototype. Paul Tretter

[10/36] from: doug:vos:eds at: 17-Apr-2001 13:37

When can I buy a copy?

[11/36] from: mat:eurogamer at: 17-Apr-2001 18:42

Heya Paul,

PT> I tried talking to math professionals looking for an answer but
PT> couldn't get one.

None of them explained the laws of entropy to you then?

-- Mat Bettinson - EuroGamer's Gaming Evangelist with a Goatee http://www.eurogamer.net | http://www.eurogamer-network.com

[12/36] from: depotcity:telus at: 17-Apr-2001 10:50

2000 megs = 400 bytes
or
20 megs = 4 bytes

Even the word "vapour" as in "vapourware" is 6 bytes.

T Brownell

[13/36] from: ryanc:iesco-dms at: 17-Apr-2001 11:38

While I tend to think Paul is mistaken, keep in mind that fractal generators may be infinitely small compared to the data that they can produce. 3 / 7 is liberally 3 bytes; how many megs of data can it produce? --Ryan

Terry Brownell wrote:

> 2000 megs = 400 bytes or
> 20 megs = 4 bytes

[14/36] from: depotcity:telus at: 17-Apr-2001 11:55

That's like saying how many megs of data can Pi produce. TBrownell

[15/36] from: ryanc:iesco-dms at: 17-Apr-2001 12:55

Terry Brownell wrote:

> That's like saying how many megs of data can Pi produce. > > TBrownell

Exactly my point. So far it seems to be an awful lot. I would say the real problem is that 3 bytes can only contain about 16.7 million different combinations, so at maximum, only that number of documents could be compressed using a truly optimum technique. Such a technique would leave a vast number of documents in the world, and in times to come, incompressible or requiring more than 3 bytes. Several years ago I came up with a compression scheme that could be applied redundantly, over and over, so that you could simply keep on compressing until the document reached a minuscule size. After writing the program I discovered the only problem was that almost everything I compressed ended up larger than it was before I "compressed" it. However, it did make an interesting encryption program. :^) --Ryan
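The counting argument in this message can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (not REBOL, and not anything posted to the list); the function names are made up for illustration. It counts how many distinct values fit in 3 bytes and shows why no lossless scheme can map every 4-byte file to a shorter one.

```python
def distinct_values(n_bytes: int) -> int:
    """Number of different byte strings of exactly n_bytes bytes."""
    return 256 ** n_bytes


def strings_up_to(n_bytes: int) -> int:
    """Number of different byte strings of length 0 through n_bytes."""
    return sum(256 ** k for k in range(n_bytes + 1))


def can_shrink_every_file(n_bytes: int) -> bool:
    """Could a lossless scheme map every n_bytes-long file to a shorter one?

    Lossless means two different inputs never share an output, so there
    must be at least as many shorter outputs as there are inputs.
    """
    return strings_up_to(n_bytes - 1) >= distinct_values(n_bytes)


print(distinct_values(3))        # values representable in 3 bytes
print(strings_up_to(3))          # all outputs of 3 bytes or fewer
print(can_shrink_every_file(4))  # pigeonhole: not enough short outputs
```

The same comparison fails for every file length, which is why a scheme that shrinks some inputs has to grow others.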

[16/36] from: joel:neely:fedex at: 17-Apr-2001 16:56

It's a good thing this thread was posted to the REBOL mailing list instead of a hard-core tech list like cypherpunks (at least before it got covered over with spam). Those folks had NO patience with technical faux pas nor naiveté.

First, let's remember the difference between lossy and lossless (de)compression.

Lossy compression schemes (e.g. JPEG) approximate original data in a way that takes less data (i.e., increased compression ratios) to achieve a poorer approximation. In other words, the more you compress, the worse the reconstructed data compare with the original. This works well (up to a point) with photos meant to be viewed by humans, since we don't notice the noise of the approximation as being too different from the normal background texture of most images. But try to use JPEG on a simple "spot color" graphic, and you'll see the effects VERY quickly.

Lossless compression schemes (e.g., RLE, LZW, etc.) operate by finding patterns in the original data and replacing them with what amounts to instructions that can be followed to reproduce the patterns exactly. In general, lossless compression schemes don't advertise the compression rates of lossy schemes, but that's the price you pay for perfect reproduction (such as you MUST have for executable code, for example).

As Ryan mentioned in another post, 3 bytes can only represent 16777216 distinct values. A quick calculation from the email I am replying to (considering only spaces and letters) shows an entropy of ~4.166 bits per character. That means that the set of all possible 3-byte binary values could only code the set of all possible ~5.76-character messages (made up of only space and letters, conforming to the original source model). Therefore, any lossless compression scheme over all messages in this population will top out at about 48% savings.

Ryan Cole wrote:

> While I tend to think Paul is mistaken, take in mind that fractal > generators may be infinitely small compared to the data that they > can produce. 3 / 7 is liberally 3 bytes, how many megs of data can > it produce? >

It doesn't matter. Although the sequence "3/7" is a valid encoding for the infinite message

0.42857142857142857142857142857142857142857142857...

(and therefore highly efficient ;-) I challenge you to find an equally compact encoding for the highly-similar message

0.42857142857142857142857142857142857142857412857...

(yes, they are different, if you look closely enough).

If both of these messages are in the set of possible messages I need to be able to encode, then the average cost of an "a/b" encoding begins to cost more as the set of possible messages grows.

What makes the whole system cost even more is that you also have to take the size of the (de)compression algorithm itself into account. Consider that the absolute best possible compression technique (averaged, again, over the entire set of messages capable of being handled) would be to use a dictionary containing every possible message. If all messages were equally likely, the best possible compression would be to represent each message by its position in the dictionary (in binary, of course).

Finally, I don't claim comprehensive knowledge, but everything I've read about "fractal compression" makes it sound like a lossy compression scheme.

-jn-
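The entropy figures quoted earlier in this message (about 4.166 bits per character, about 5.76 characters in 3 bytes, about 48% savings) can be reproduced along the following lines. This is a hedged sketch in Python, not the calculation actually used in the post: it assumes a simple model in which each character is drawn independently according to its observed frequency, and the helper names are invented for illustration.

```python
import math
from collections import Counter


def letters_and_spaces(text: str) -> str:
    """Keep only letters and spaces, lower-cased (the model assumed above)."""
    return "".join(c for c in text.lower() if c.isalpha() or c == " ")


def entropy_bits_per_char(text: str) -> float:
    """Shannon entropy of the character frequencies, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def chars_that_fit(bits_available: int, bits_per_char: float) -> float:
    """How many characters of this source fit in the given number of bits."""
    return bits_available / bits_per_char


def best_savings(bits_per_char: float, stored_bits: int = 8) -> float:
    """Best average savings versus storing each character in stored_bits bits."""
    return 1 - bits_per_char / stored_bits


sample = letters_and_spaces("Coming soon! Imagine putting 3600 CDs on a floppy.")
print(entropy_bits_per_char(sample))   # depends entirely on the sample text
print(chars_that_fit(24, 4.166))       # 3 bytes at the quoted entropy
print(best_savings(4.166))             # savings bound at the quoted entropy
```

A different source text gives a different entropy figure, so the first printed value will not match the one in the post unless the same email is used as input.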

[17/36] from: timewarp:sirius at: 17-Apr-2001 14:38

don't knock it, genius is heaven born. i happen to know that it is possible to design and build an npmg (near perpetual motion generator) ... such things break all the rules and do not conform to how anyone sees anything "today". watch, this compression utility will work and it will be fast. cheerfulness and have faith in the impossible, -----EAT

[18/36] from: ryanc:iesco-dms at: 17-Apr-2001 16:32

Joel Neely wrote:

<snip> > It doesn't matter. Although the sequence "3/7" is a valid encoding

<<quoted lines omitted: 3>>

> equally compact encoding for the highly-similar message > 0.42857142857142857142857142857142857142857412857...

Not exactly equally compact, but how about this:

23 / 7 S 6

23 divided by 7, skipping the first 6 digits.

> (yes, they are different, if you look closely enough). If both of > these messages are in the set of possible messages I need to be able

<<quoted lines omitted: 16>>


I know it does not work, but I could see the expression on your face from here. lol, lol, lol, lol. I am surprised you didn't hear my laughs. Whew! I will be getting a good chuckle for a while, thanks! Of course, if I knew how to do such things, Joel, you would have heard about it on the 6 o'clock news. ;^) Cheers, --Ryan

Ryan Cole Programmer Analyst www.iesco-dms.com 707-468-5400 I am enough of an artist to draw freely upon my imagination. Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world. -Einstein

[19/36] from: depotcity:telus at: 17-Apr-2001 18:16

Have faith in the impossible? Next thing you'll be telling us is XML is the answer to the semantic web! :) TBrownell

[20/36] from: joel:neely:fedex at: 17-Apr-2001 17:12

Ryan Cole wrote:

> Joel Neely wrote: > <snip>

<<quoted lines omitted: 10>>

> 23 / 7 S 6 > 23 divided by 7 skipping the first 6 digits.

...

> I know it does not work, but I could see the expression on your face > from here. lol, lol, lol, lol. > I am suprised you didnt here my laughs. Wew! I will be getting a good > chuckle for awhile, thanks! Of course if I knew how to do such things > Joel, you would have heared about it on the 6 o'clock news. ;^) >

Well, sorry to disappoint on all three counts:

1) The correct answer is

   30000000000000000000000000000000000000000189
   --------------------------------------------
   70000000000000000000000000000000000000000000

2) Your visual imagination is better than my auditory perception. ;-)

3) I won't be on PBS, as the producers of Nova were not sufficiently impressed with the production of the above answer. :-(

The real point of this little exercise, however, may have gotten lost in the humor. Consider the following two messages:

Dear sir, we find that we are so overwhelmed with your ability to perform arithmetic with numbers of more than two digits that we would be delighted to accept your gracious invitation, and will plan to visit your residence for an interview at your earliest convenience. To be brief, yes.

and

Dear sir, madam, or whatever the case may be: Our attorneys will be contacting you to serve a court order requiring that you cease and desist any further attempts to harass our receptionist or other employees with your constant requests that we devote air time to the fact that you can do arithmetic. To be blunt, no.

If those are the only two messages possible, I can compress each of them to a single bit! The same would be true if each message were composed in triplicate, with the second copy in Anglo-Saxon and the third in Mandarin (which, for the sake of convenience, we'll assume to be represented phonetically in ASCII). The number of bits has nothing to do with the "length" of the message, but with the total number of messages that are possible, and with their relative probabilities.

Therefore, if someone claims to be able to condense the content of 3600 CDs onto a single floppy disk, I'd respond that it's entirely possible IF THERE ARE ONLY A RELATIVELY SMALL NUMBER OF POSSIBLE CDs. For example, if there are only 4096 possible CDs, one can code each one by a 12-bit value, regardless of how much data may be on each CD. Sending a 3600 CD message would then require only 5400 bytes (3600 x 12 bits = 43200 bits). (Of course, decompressing those 5400 bytes would require that the recipient already have access to the content of a full set of such CDs!)

OTOH, if a CD can contain 600Mb of arbitrary data, then a collection of 3600 CDs equates to approximately 2.16 terabytes. Any claim to be able to compress THAT collection of data down to 1.44Mb, if any CD is equally likely to appear, simply violates the laws of physics. It ain't gonna happen.

I mean no discouragement to Paul in his research into data compression techniques. It's always possible to find special-case algorithms for special-case data, and some of them have interesting applications, such as JPEG compression of images and MP3 compression of music for human consumption. However, improved performance via a special-case approach always has the cost of narrowing the range of its applicability and/or having the effect you mentioned in an earlier post -- actually making data outside its "target zone" grow significantly.

What's interesting is that there's actually a kind of conservation law here; the better it gets inside the target zone, the worse it gets outside. You can't win, you can't break even, and you can't get out of the game.

-jn-
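The "code each CD by its position in a known set" arithmetic from this message can be sketched as below. This is an illustrative Python sketch, not code from the thread; the function names are invented, and the calculation assumes every possible item is equally likely and that the receiver already holds the full set.

```python
def bits_per_item(n_possible: int) -> int:
    """Bits needed to name one item out of n_possible equally likely items."""
    if n_possible < 1:
        raise ValueError("need at least one possible item")
    return (n_possible - 1).bit_length()


def message_bytes(n_items: int, n_possible: int) -> int:
    """Bytes needed to send n_items indexes, rounded up to whole bytes."""
    total_bits = n_items * bits_per_item(n_possible)
    return (total_bits + 7) // 8


# Two possible letters (the "yes" and "no" replies): one bit each.
print(bits_per_item(2))

# 4096 possible CDs: a 12-bit index per CD, whatever each CD holds.
print(bits_per_item(4096))

# A sequence of 3600 such CDs, sent as indexes only.
print(message_bytes(3600, 4096))
```

The cost depends only on how many alternatives there are, not on how long each alternative is, which is the point the message makes about message "length".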

[21/36] from: dankelg8:cs:man:ac at: 18-Apr-2001 12:43

Strange. Do you have a different calendar system where you live? 1st of April has long passed in this part of the world :) Gisle

[22/36] from: jeff:rebol at: 18-Apr-2001 8:53

Hello, EAT:

> don't knock it, genius is heaven born. i happen to know > that it is possible to design and build an npmg (near > perpetual motion generator) ...

You've aroused my curiosity... how NEAR to perpetual motion are we talking? Like almost perpetual? Real close to perpetual? Just short of perpetual? I mean, how much closer, would you estimate, to perpetual motion can you attain with the device you mention than, say, your typical NON-perpetual motion device (like a YO-YO, for instance)? Like MUCH MUCH closer to perpetual? And when you say "perpetual motion generator" I have to ask, what does it generate perpetual motion for? How does the perpetual motion come out of the generator?

> such things break all the rules and do not conform to how anyone > sees anything "today".

Your use of quotes around the word "today" confuses me. Usually quoting a word indicates something with questionable definition, like a jargon term, or something that is qualified in its usage. For example, I might say: This near perpetual motion generator should work "forever".

> watch, this compression utility will work and it will be > fast. > > cheerfulness and have faith in the impossible,

Certainly. By definition the impossible happens all the time! If it weren't impossible then it wouldn't not not be possible! -jeff

[23/36] from: m:koopmans2:chello:nl at: 18-Apr-2001 20:08

Hey Jeff, Didn't know you were a physicist too :) ooooops, blown my cover... --Maarten

[24/36] from: louisaturk:eudoramail at: 26-Jun-2001 21:48

Hi everybody,

This works:

write/binary %/a/data.r compress read/binary %data.r

But this produces an error message immediately when I do the program:

write/binary %data.r decompress read/binary %/a/data.r

>> do %db.r
** Syntax Error: Missing [ at end-of-block
** Near: (line 2) (unprintable compressed bytes)

print %data.r decompress read/binary %/a/data.r

works fine. What is causing the error message?

Louis

[25/36] from: brett:codeconscious at: 27-Jun-2001 12:27

Hi Louis,

Two possibilities:

1) Check that you have not accidentally compressed %db.r itself.

2) I note from one of your earlier scripts that the load-data function has the following line:

data: load/all db-file

which attempts to load without decompressing. Maybe that is your problem.

Brett.

[26/36] from: louisaturk:coxinet at: 26-Jun-2001 21:37

Hi everybody, This works: write/binary %/a/data.r compress read/binary %data.r But this produces an error message immediately when I do the program: write/binary %data.r decompress read/binary %/a/data.r

>> do %db.r

[27/36] from: louisaturk:eudoramail at: 27-Jun-2001 0:21

Hi Brett,

Thanks for the response. It might be the second possibility. If I type the following at the command prompt it works:

write/binary %data.r decompress read/binary %/a/data.r

If compress/decompress is used for every write, then performance will be affected. I just want to compress the data so that more will fit on a floppy disk for backup purposes. I don't want the database compressed during ordinary usage. How do I get around this problem?

Louis

[28/36] from: brett:codeconscious at: 27-Jun-2001 14:55

Hi Louis,

> It might be the second possibility. If I type the following at the command
> prompt it works:
>
> write/binary %data.r decompress read/binary %/a/data.r

Ok, this is the real problem: you use the same name for the uncompressed version as you do for the compressed version. Logically it works but in practise it is not a good idea. It is risky: you are increasing your chances of losing data. Why? Because devices (computers) sometimes fail. It might be a floppy (quite prone to failure), your hard disk, or a power outage, whatever. When you are transforming your data as significantly as you are in the compression/decompression steps, you should write the transformed data to a new file. While on risks, the REBOL User Guide warns about storing data using compression: if a couple of bits are changed by some stray cosmic radiation or whatever, you're more likely to be unable to recover the data.

> If compress/decompress is used for every write, then performance will be > affected.

There's always a trade-off. With compression you are exchanging processing power for disk space. You need to choose what is more important to you taking into consideration the amount of information you are dealing with, the patterns of your use, the equipment you will use it on, etc...

> I just want to compress the data so that more will fit on a > floppy disk for backup purposes.

Fair enough. So just do the compression using another script when you want to perform a backup. If you are using a different name there will be no confusion. Floppies are small on price and offer similar protection for backing up data. Multiple backups of the same data would be a good idea. Depends on how critical your data is. Oh and don't forget to check that you can read a file from a floppy once you have written to it - always good to test your backups actually are backups and not placebos.

> I don't want the database compressed > during ordinary usage. How do I get around this problem?

The solution is simply to leave data.r uncompressed and use it as normal. Then as mentioned above compress the information it contains when you need to and save to a different name and possibly a different location. Brett
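The backup routine described in this message (leave the working file alone, write the compressed copy under a different name, then check that the copy can be read back) might look roughly like this. It is a sketch in Python rather than REBOL, with `zlib` standing in for REBOL's `compress`/`decompress`; the function names and any file names passed in are assumptions, not part of Louis's program.

```python
import zlib
from pathlib import Path


def backup_compressed(source: Path, backup: Path) -> int:
    """Write a compressed copy of source to backup and verify it.

    Refuses to overwrite the working file, and reads the backup back
    to confirm it decompresses to the original bytes.
    Returns the size of the backup file in bytes.
    """
    if backup.resolve() == source.resolve():
        raise ValueError("backup must use a different name than the working file")
    original = source.read_bytes()
    backup.write_bytes(zlib.compress(original))
    # Test the backup: a copy that cannot be restored is not a backup.
    restored = zlib.decompress(backup.read_bytes())
    if restored != original:
        raise OSError("backup verification failed")
    return backup.stat().st_size


def restore(backup: Path, target: Path) -> None:
    """Decompress backup into target, refusing to clobber an existing file."""
    if target.exists():
        raise FileExistsError(f"{target} already exists; restore elsewhere")
    target.write_bytes(zlib.decompress(backup.read_bytes()))
```

Restoring to a fresh name, rather than over the live file, keeps a failed or interrupted restore from destroying the only good copy.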

[29/36] from: arolls:bigpond:au at: 27-Jun-2001 16:15

I noticed that the straight text compression is generally better than the binary mode compression. Try this in a directory with text files:

foreach f read %./ [
    if not dir? f [
        print [
            f
            length? read f
            length? compress read f
            length? read/binary f
            length? compress read/binary f
        ]
    ]
]

I saw for most files the second number smaller than the fourth.

I had no trouble using your method to 'do compressed scripts. I think Brett is right, and you have overwritten your original filename at some point, or the floppy data is a bit wrong. Overwriting a file is a common way to lose data. I was most happy to learn that rebol's 'rename function prevents you from overwriting a file that exists already.

Anton.

[30/36] from: joel:neely:fedex at: 27-Jun-2001 1:38

Hi, Louis, Try using the same file you just decompressed! ;-) Dr. Louis A. Turk wrote:

> Hi everybody, > This works:

<<quoted lines omitted: 9>>

> works fine. > What is causing the error message?

Your transcript shows me that

%data.r -(compress)-> %/a/data.r -(decomp)-> %data.r

after which you try to do %db.r, which isn't any of the above files. What's in db.r???

-jn-

-- ------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com

[31/36] from: joel:neely:fedex at: 27-Jun-2001 1:47

Hi, Brett, Brett Handley wrote:

> Hi Louis, > > It might be the second possibility. If I type the following

<<quoted lines omitted: 4>>

> uncompressed version as you do for the compressed version. > Logically it works but in practise it is not a good idea.

That would be true only if he had actually done change-dir %/a/ prior to the line above. If REBOL still thought the current directory were anywhere else, then the line above uses the same name but in a different directory. Thus it wouldn't be the real problem. Please see my response to Louis for what I believe to be the real source of his difficulties. -jn- ------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com

[32/36] from: joel:neely:fedex at: 27-Jun-2001 2:04

Hi, Louis, Dr. Louis A. Turk wrote:

> It might be the second possibility. If I type the following > at the command prompt it works:

<<quoted lines omitted: 4>>

> purposes. I don't want the database compressed during > ordinary usage. How do I get around this problem?

I can do all of

>> write/binary %compd.rz compress read %compd.r
>> do decompress read/binary %compd.rz
>> timeall/run 10 10 ;a function defined in compd.r
;... normal output occurs
>> write/binary %foo.r decompress read/binary %compd.rz

without incident, after which diff shows compd.r and foo.r to be identical.

Given a data file, as below

$ cat whatever.data
1234 "Ferd Burfel" 127.0.0.1 #901-555-1212
2345 "Joe Doaks" 127.255.255.255 #615.555.1212
3456 "Patrick Henry" 255.255.255.0 #206.555.1212
$

I can do all of the following without problems

>> write/binary %whatever.rz compress read %whatever.data
>> foo: load decompress read/binary %whatever.rz
== [1234 "Ferd Burfel" 127.0.0.1 #901-555-1212 2345 "Joe Doaks" 127.255.255.255 #615.555.1212 3456 "Patrick Henry" 255.25...
>> print foo
1234 Ferd Burfel 127.0.0.1 901-555-1212 2345 Joe Doaks 127.255.255.255 615.555.1212 3456 Patrick Henry 255.255.255.0 206.555.1212

Therefore, your core strategy could be

1) decompress the data after reading from file
2) modify the memory data as needed
3) compress the data and write back to file

Of course, this provides lots of failure modes:

a) an error could occur during (2)
b) the system could hang/crash during (2)
c) a problem during (3) could smash the data beyond repair
...etc...

Problems (a) and (b) could occur even if you were using an uncompressed file. The consequence is loss of modifications since last write. Problem (c) could also occur, but you'd have more of a chance of recovering some data by hand if the file were uncompressed. The worst-case consequence is loss of all data.

Standard safety techniques include

i) backing up between sessions
ii) saving the data (whether compressed or not) between modifications within a session
iii) writing a "journal file" entry for each modification, within a session, combined with doing (i)

Option (i) is coarse-grained safety, with lowest cost. Option (ii) is most costly in time, and even more so if compressing with each save-to-file operation. Option (iii) is often a reasonable compromise, but does require that you have a recovery script that is able to take a stable image file and replay the journal entries (add/change/delete) to bring back to the last checkpointed status.

YMMV, but I'd say that reinventing all of the functionality and security of a real database engine from scratch is a very non-trivial task. Good luck!

-jn-

------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com
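Option (iii), the journal file plus a recovery script, could be sketched as follows. This is a minimal illustration in Python, not REBOL and not code from the thread: the JSON-lines journal format, the `append_journal` and `replay` names, and the use of a plain dictionary as the "stable image" are all assumptions made for the example.

```python
import json
from pathlib import Path


def append_journal(journal: Path, op: str, key: str, value=None) -> None:
    """Append one modification (add, change, or delete) to the journal file."""
    entry = {"op": op, "key": key, "value": value}
    with journal.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
        fh.flush()


def replay(snapshot: dict, journal: Path) -> dict:
    """Rebuild current data from the last stable snapshot plus the journal."""
    data = dict(snapshot)  # leave the caller's snapshot untouched
    if not journal.exists():
        return data
    with journal.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if entry["op"] in ("add", "change"):
                data[entry["key"]] = entry["value"]
            elif entry["op"] == "delete":
                data.pop(entry["key"], None)
            else:
                raise ValueError(f"unknown journal operation: {entry['op']}")
    return data
```

In this arrangement only the small journal is written during a session; the large image file, compressed or not, is rewritten at checkpoints, after which the journal can be started afresh.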

[33/36] from: agem:crosswinds at: 27-Jun-2001 14:36

RE: [REBOL] Re: Compression [louisaturk--eudoramail--com] wrote:

> Hi Brett, > Thanks for the response.

<<quoted lines omitted: 5>>

> floppy disk for backup purposes. I don't want the database compressed > during ordinary usage. How do I get around this problem?

add some little bytes at the start which tell if the file is compressed. something like:

db-file: %data.r    ; one name for reading and writing (adjust as needed)

read-db: does [
    data: read/binary db-file
    ; split off the tag, up to and including the first ":"
    parse/all data [copy type thru ":" copy data to end]
    if "compressed:" = to-string type [data: decompress data]
]

write-db: does [
    write/binary db-file join to-binary "uncompressed:" data
]

write-db-compressed: does [
    write/binary db-file join to-binary "compressed:" compress data
]

write-db-compressed
read-db
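The idea of a small tag at the start of the file can also be sketched in Python, with `zlib` standing in for REBOL's `compress`/`decompress`. The tag strings mirror the ones in this message; the `encode` and `decode` names are hypothetical and exist only for this illustration.

```python
import zlib

COMPRESSED_TAG = b"compressed:"
PLAIN_TAG = b"uncompressed:"


def encode(payload: bytes, compress: bool) -> bytes:
    """Prefix the payload with a tag saying whether it is compressed."""
    if compress:
        return COMPRESSED_TAG + zlib.compress(payload)
    return PLAIN_TAG + payload


def decode(blob: bytes) -> bytes:
    """Read the tag up to the first ':' and undo compression if needed."""
    tag, sep, body = blob.partition(b":")
    if not sep:
        raise ValueError("missing tag: no ':' found")
    tag += sep
    if tag == COMPRESSED_TAG:
        return zlib.decompress(body)
    if tag == PLAIN_TAG:
        return body
    raise ValueError(f"unknown tag: {tag!r}")
```

Because the reader decides from the tag, the working copy can stay uncompressed while the floppy copy is compressed, and the same read routine handles both.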

[34/36] from: brett:codeconscious at: 27-Jun-2001 22:40

> Please see my response to Louis for what I believe to be the > real source of his difficulties.

Well you've got two on the list currently that I can see Joel. (a) "Try using the same file you just decompressed!" or (b) "...reinventing all of the functionality and security of a real database engine from scratch is a very non-trivial task. Good luck!" (a) is the concrete level. (b) is the overview level. My response to Louis was positioned somewhere between these two. I hope that helps you understand my message now. Brett.

[35/36] from: joel:neely:fedex at: 27-Jun-2001 3:34

Hi, Brett, Sorry for the lack of clarity! Brett Handley wrote:

> > Please see my response to Louis for what I believe to be the > > real source of his difficulties. > > Well you've got two on the list currently that I can see Joel. > > (a) "Try using the same file you just decompressed!" >

That's the one I meant. -jn- ------------------------------------------------------------ Programming languages: compact, powerful, simple ... Pick any two! joel'dot'neely'at'fedex'dot'com

[36/36] from: louisaturk:eudoramail at: 7-Jul-2001 18:11

Hi Brett, Joel, Volker, and Anton, Many thanks for your responses. Sorry to take so long to respond. I'm studying all that you have said. It sounds like compression greatly increases the chance of losing data. I think I'll have to postpone using this feature until I have more time to study it deeper. Louis

Notes

Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted