Mailing List Archive: 49091 messages
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

[REBOL] Re: Compression

From: joel:neely:fedex at: 17-Apr-2001 17:12

Ryan Cole wrote:
> Joel Neely wrote: > <snip> > > > > > ... Although the sequence "3/7" is a valid encoding for... > > > > 0.42857142857142857142857142857142857142857142857... > > > > ... I challenge you to find an equally compact encoding for... > > > > 0.42857142857142857142857142857142857142857412857... > > > > Not exactly equally compact, but how about this: > > 23 / 7 S 6 > > 23 divided by 7 skipping the first 6 digits. >
...
> I know it does not work, but I could see the expression on your face > from here. lol, lol, lol, lol. > I am suprised you didnt here my laughs. Wew! I will be getting a good > chuckle for awhile, thanks! Of course if I knew how to do such things > Joel, you would have heared about it on the 6 o'clock news. ;^) >
Well, sorry to disappoint on all three counts: 1) The correct answer is 30000000000000000000000000000000000000000189 -------------------------------------------- 70000000000000000000000000000000000000000000 2) Your visual imagination is better than my auditory perception. ;-) 3) I won't be on PBS, as the producers of Nova were not sufficiently impressed with the production of the above answer. :-( The real point of this little exercise, however, may have gotten lost in the humor. Consider the following two messages: Dear sir, we find that we are so overwhelmed with your ability to perform arithmetic with numbers of more than two digits that we would be delighted to accept your gracious invitation, and will plan to visit your residence for an interview at your earliest convenience. To be brief, yes. and Dear sir, madam, or whatever the case may be: Our attorneys will be contacting you to serve a court order requiring that you cease and desist any further attempts to harrass our receptionist or other employees with your constant requests that we devote air time to the fact that you can do arithmetic. To be blunt, no. If those are the only two messages possible, I can compress each of them to a single bit! The same would be true if each message were composed in triplicate, with the second copy in Anglo-Saxon and the third in Mandarin (which, for the sake of convenience, we'll assume to be represented phonetically in ASCII). The number of bits has nothing to do with the "length" of the message, but with the total number of messages that are possible, and with their relative probabilities. Therefore, if someone claims to be able to condense the content of 3600 CDs onto a single floppy disk, I'd respond that it's entirely possible IF THERE ARE ONLY A RELATIVELY SMALL NUMBER OF POSSIBLE CDs. For example, if there are only 4096 possible CDs, one can code each one by a 12-bit value, regardless of how much data may be on each CD. Sending a 3600 CD message would then require only 675 bytes. (Of course, decompressing that 675 bytes would require that the recipient already have access to the content of a full set of such CDs!) OTOH, if a CD can contain 600Mb of arbitrary data, then a collection of 3600 CDs equates to approximately 2.16 terabytes. Any claim to be able to compress THAT collection of data down to 1.44Mb, if any CD is equally likely to appear, simply violates the laws of physics. It ain't gonna happen. I mean no discouragement to Paul in his research into data compression techniques. It's always possible to find special-case algorithms for special-case data, and some of them have interesting applications, such as JPEG compression of images and MP3 compression of music for human consumption. However, improved performance via a special-case approach always has the cost of narrowing the range of its applicability and/or having the effect you mentioned in an earlier post -- actually making data outside its "target zone" grow significantly. What's interesting is that there's actually a kind of conservation law here; the better it gets inside the target zone, the worse it gets outside. You can't win, you can't break even, and you can't get out of the game. -jn-