Rebol and a new compression system.
[1/7] from: tbrownell:shaw:ca at: 4-Mar-2002 12:13
In my endless study of language using Rebol tools, I've devised a way to compress correspondence
60% more than the most advanced method currently available.
This works only for communication using standard english words, such as this e-mail,
or an encyclopedia article etc. If the communication is full of nonsensical strings,
such as enbased data, it will actually increase the size of the document by 3.
For example, this e-mail is approx. 607 bytes raw. Compressed with winzip it is 355
bytes, and compressed using the system it is 213 bytes. (If a better compression system
than winzip is used, then the 213 would be even smaller.)
I know the Rebol gang is a techno savvy bunch...question is, is there any practical use
for such a system?
TB
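As a quick check on the figures quoted in this message (607 bytes raw, 355 bytes with winzip, 213 bytes with the new system), the ratios can be worked out directly. This is only arithmetic on the reported numbers; the helper name `percent_of` is made up for illustration, and Python is used purely for convenience.

```python
# Sizes in bytes, as reported in the message.
RAW, WINZIP, SYSTEM = 607, 355, 213

def percent_of(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

if __name__ == "__main__":
    print("winzip output vs raw:", percent_of(WINZIP, RAW), "%")
    print("system output vs raw:", percent_of(SYSTEM, RAW), "%")
    print("system output vs winzip output:", percent_of(SYSTEM, WINZIP), "%")
```

On the reported numbers, the system's output is 60% of the size of the winzip output, that is, 40% smaller.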
[2/7] from: al:bri:xtra at: 5-Mar-2002 16:29
Terry wrote:
> is there any practical use for such a system?
Would it work well on a Rebol script? That would make a Rebol script smaller and
harder for the uninitiated to decode.
What about a dictionary for spell checking?
What about a thesaurus?
It could be of use in the RebMail project?
Just a few suggestions.
Andrew Martin
ICQ: 26227169 http://valley.150m.com/
[3/7] from: reboler::programmer::net at: 5-Mar-2002 10:20
Terry,
Sounds great! Would have many uses where text is the medium (html, REBOL scripts, e-mail,
etc).
However, I will remain politely skeptical until I see it in action.
While I don't have much experience with the mechanics of compression, I have read some
about it.
And from that I am heartened by your description of giving better compression for a specific
type of file ("This works only for communication using standard english words...").
_A_Lot_ of work has been done in this area, and _many_ claims of a better system have
been proven false. But many compression schemes are tuned for _any_ type of file (standard
zip, PKZIP, WinZip, etc.), so claims for a better system for a specific type of file are
plausible.
I am openly skeptical about your (implied) claim that you can greatly compress an already
compressed file ("If a better compression system than winzip is used, then the 213 would
be even smaller."). While not impossible, I would like to see if it is a general phenomenon
for any English text file.
Can it be tuned for REBOL scripts?
I've been toying with using Huffman encoding (thanks RebolForces!) on the level of words
instead of characters. Is this similar to your scheme?
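To make the idea of Huffman coding on words rather than characters concrete, here is a minimal sketch in Python. It is a generic textbook construction, not the RebolForces script and not Terry's system; the function names (`tokenize`, `build_codes`, `encode`, `decode`) are made up for this example, and the bit strings are kept as text for readability.

```python
import heapq
import re
from collections import Counter

def tokenize(text):
    """Split text into words and the separators between them, losslessly."""
    return re.findall(r"\w+|\W+", text)

def build_codes(tokens):
    """Build a Huffman code table mapping each distinct token to a bit string."""
    counts = Counter(tokens)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {token: partial code}).
    heap = [(w, i, {tok: ""}) for i, (tok, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Prefix one bit onto every code in each of the two lightest subtrees.
        merged = {t: "0" + c for t, c in left.items()}
        merged.update({t: "1" + c for t, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(tokens, codes):
    """Concatenate the code of every token into one bit string."""
    return "".join(codes[t] for t in tokens)

def decode(bits, codes):
    """Walk the bit string, emitting a token whenever a full code is matched."""
    inverse = {c: t for t, c in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw the cat"
    tokens = tokenize(sample)
    codes = build_codes(tokens)
    bits = encode(tokens, codes)
    assert decode(bits, codes) == sample
    print(len(sample) * 8, "bits raw,", len(bits), "bits encoded (code table not counted)")
```

The bit count printed here leaves out the code table. For a short text the table of words can easily outweigh the savings, so a scheme like this only pays off when sender and receiver already share the table or the text is long.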
[4/7] from: ryanc:iesco-dms at: 5-Mar-2002 11:09
my-comments: decompress #{
789C5D54BB72DC3010EBF5153B6ED2E85CF80F3CA9AECD64929A9256A7CDF115
3E2CEBEF8325CF4AE2E6668E0F000B80BAAEF49D533A683399D857C7C9145EA8
6C8CBF372B79A345E622C19B748C2485F216AA5D680FE94EBFAB14A69DAD7D26
BA624D77E6E0B32C9CA898BBF89BDE099E29178EB4562027327EA1E1A4A34D6E
9B3DA8666EB80AE126F14669733BBC04CAC1B162177E2FD558A04F16A8BB9479
03CD8803906D4A43C8B46FECDB180DF0E9FA44921B431B62E8F738530964283B
632DA74C9394CB2E4BD9743DF16A792E0D65E51DBAA1CB054F21766538B3066B
C38EE99B8D235D573A424D943949A8103F855AC0383ECCC9739249850F4BD87D
3F0BB34C1EDB9C4A05B332CC74EC9545C308A1EFE63A39697A9C724778E761EE
BACACCBA70E38261DAF242113FF005D27E0AE639D787C7867223F637A6173AD8
6078CE593961C5A168CEA43B004F89407AADB37A8F7D5015053961E1EE1BA348
FC0EFC2C6FDC02518E89B500B4229A1F5F29241A0C4D153BE57918AE643E209D
5958F152B825E37A969F2A99780AB6073C02A3C0A645962F30C1D579D37862C2
14A2195575A6399FC5C31F177279006808B1B43AD0D01834AADE1B1F1097FF87
ED6FFD35E50D476FDBC789D5B47A98CFC02661129BD82C7052FC71B99C423373
17ABE42BDC10B5E1F5015304256F679B90F329D5AC662BA526D1EE4D21DC7393
04EBA5151EAC6AF8DC9A16038C989AAF77EEAF07DD659DE295BC89788F89AC60
6F4059DF50A1B1717A9E61A14982ABF1258E5D98D19A67942A44DBBAA6F5B5C1
682923A70FE791D343D7B577EB4C375A7360D7C03FB465D74A0E331C6A2DCA6A
C819FDE5E2C423101DF7BF48571D2D65D68F4DE951E16D200C8CC2B9F7518D84
1B3160B57744DA93A4DDF8D25E0D84747EEDC6152AFAC36C25EE8038AED5ECF7
E6CDF85B3716B2F0F9B0E833CCBCFBB0EB7D878F57A15F15ED8A1057B9277AB9
7C3B8C1F863FE1A4E9E262050000
}
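For readers without a REBOL console, a payload like the one above can probably be unpacked in Python as well. This sketch assumes, without confirmation from this thread, that REBOL 2's `compress` emits a standard zlib stream (the leading `789C` is consistent with a zlib header) followed by a 4-byte little-endian count of the uncompressed length. The function name `rebol_decompress` is made up for this example.

```python
import struct
import zlib

def rebol_decompress(hex_text):
    """Decode the hex found between #{ and } and inflate it as a zlib stream.

    Returns (data, declared_length). declared_length is None when fewer
    than 4 bytes follow the end of the zlib stream.
    """
    raw = bytes.fromhex("".join(hex_text.split()))
    inflater = zlib.decompressobj()
    data = inflater.decompress(raw) + inflater.flush()
    trailer = inflater.unused_data  # bytes after the end of the zlib stream
    # Assumption: the trailer holds the original length, little-endian.
    declared = struct.unpack("<I", trailer[:4])[0] if len(trailer) >= 4 else None
    return data, declared

if __name__ == "__main__":
    # Build a stand-in payload in the assumed layout rather than the real one.
    message = b"A sentence is nothing more than a collection of symbols."
    payload = zlib.compress(message) + struct.pack("<I", len(message))
    text, declared = rebol_decompress(payload.hex().upper())
    print(text.decode("ascii"), declared)
```

To try it on the message above, paste the hex lines (without the `#{` and `}`) into a string and pass it to `rebol_decompress`. If the assumed layout is wrong, `zlib` will raise an error or the declared length will not match.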
alan parman wrote:
[5/7] from: tbrownell:shaw:ca at: 5-Mar-2002 13:49
Just doing additional testing with the following results
The string to compress...
A sentence is nothing more than a collection of symbols.
Original - 448 bits
Huffman only - 212 bits
LSEC - 136 bits (LFReD Standard English Compression.. our system)
Huffman/LSEC - 64 bits
Compressing first with LSEC and then compressing the result with Huffman
yields a size of just over 14% of the original, with 0 loss.
Again, this only works with text documents or correspondence, chat dialog,
newspaper articles etc.
Terry Brownell
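The "Original" figure in this message follows from 56 characters at 8 bits each, which gives 448 bits. As a rough cross-check of the "Huffman only" line, the sketch below computes the payload size of a character-level Huffman code for the same sentence. It is a generic textbook construction, not necessarily the Huffman script used for the reported figures, and it ignores the cost of storing the code table, so its result need not match the 212 bits reported. The name `huffman_payload_bits` is made up for this example.

```python
import heapq
from collections import Counter

def huffman_payload_bits(text):
    """Total bits needed to encode text with an optimal character-level
    Huffman code, excluding the code table itself."""
    weights = list(Counter(text).values())
    if len(weights) <= 1:
        return len(text)  # one distinct symbol: 1 bit per occurrence
    heapq.heapify(weights)
    total = 0
    while len(weights) > 1:
        merged = heapq.heappop(weights) + heapq.heappop(weights)
        total += merged  # each merge adds one bit to every symbol beneath it
        heapq.heappush(weights, merged)
    return total

if __name__ == "__main__":
    sentence = "A sentence is nothing more than a collection of symbols."
    raw_bits = len(sentence) * 8
    coded_bits = huffman_payload_bits(sentence)
    print(raw_bits, "bits raw;", coded_bits, "bits Huffman payload")
    # Ratios for the figures reported in the message (448, 212, 136, 64 bits).
    for label, bits in [("Huffman only", 212), ("LSEC", 136), ("Huffman/LSEC", 64)]:
        print(label, round(100.0 * bits / 448, 1), "% of original")
```

Any real transmission would also have to carry, or share in advance, whatever tables the decoder needs, and for a single sentence that overhead can exceed the payload.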
[6/7] from: tbrownell:shaw:ca at: 5-Mar-2002 14:09
Patents have me spooked. I'm turned off by patents and the whole system.
Amazon has a "Buy one now." button patent. Where does it end? And the cost
of submitting and defending is out of my budget. Any VCs out there? :)
TB
[7/7] from: ryanc:iesco-dms at: 5-Mar-2002 16:35
Agreed! A friend of a friend spent a quarter million patenting a bicycle lock.
It's those damn patent attorneys that suck up the money; the patent fees are
relatively cheap. If you wanna get super-duper rich, program LFReD to replace
lawyers and attorneys.
--Ryan
Terry Brownell wrote:
Notes
- Quoted lines have been omitted from some messages.