More gz compression

[1/5] from: henrik:webz:dk at: 22-Jul-2002 4:02

Hi,

Having followed the thread on compression, I found that rebol uses gzip. Fine, I figured; PHP supports that too. I'm trying to post compressed data via http to a php page using rebol's internal compression scheme. PHP should then decompress it and send it into a database, so that rebol can access a remote sql database that way.

Did anyone make rebol's compress match the one used in gzcompress() with php? They seem to produce slightly different binaries. I've tried different compression levels with php, but rebol barfs on the data sent from php to rebol when trying to decompress it. I haven't tried it the other way yet.

--
Regards,
Henrik Mikael Kristensen

[2/5] from: g:santilli:tiscalinet:it at: 25-Jul-2002 16:13

Hi Henrik,

On Monday, July 22, 2002, 4:02:01 AM, you wrote:

HMK> Did anyone make rebol's compress match the one used in gzcompress() with php?

I never tried, but maybe PHP's just adding the GZIP header that REBOL does not add?

Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r
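Editor's note: one way to check the header question is to look at the first two bytes of each blob. The sketch below is in Python rather than REBOL or PHP, purely as a neutral tool; `sniff_container` is a hypothetical helper name. It relies on two facts from the format specifications: a gzip member (RFC 1952) starts with 1F 8B, while a zlib stream (RFC 1950) starts with a two-byte header whose low nibble is 8 and whose big-endian value is a multiple of 31. PHP's gzcompress() is documented as producing the zlib format and gzencode() the gzip format.

```python
import gzip
import zlib


def sniff_container(data: bytes) -> str:
    """Guess the framing of a compressed blob from its first two bytes."""
    if len(data) < 2:
        return "unknown"
    if data[0] == 0x1F and data[1] == 0x8B:
        return "gzip"  # RFC 1952 magic number
    # RFC 1950: the low nibble of byte 0 is the method (8 = deflate) and
    # the first two bytes, read as a big-endian number, are divisible by 31.
    if (data[0] & 0x0F) == 8 and ((data[0] << 8) | data[1]) % 31 == 0:
        return "zlib"
    return "unknown"


payload = b"Rebol"
zlib_blob = zlib.compress(payload)  # zlib framing, as gzcompress() uses
gzip_blob = gzip.compress(payload)  # gzip framing, as gzencode() uses

print(sniff_container(zlib_blob), zlib_blob[:2].hex())
print(sniff_container(gzip_blob), gzip_blob[:2].hex())
```

If both sides report "zlib", a missing GZIP header is unlikely to be the cause of the mismatch.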

[3/5] from: henrik:webz:dk at: 31-Jul-2002 3:57

On Thursday 25 July 2002 16:13, you wrote:

> Hi Henrik,
>
> On Monday, July 22, 2002, 4:02:01 AM, you wrote:
>
> HMK> Did anyone make rebol's compress match the one used in gzcompress() with php?
>
> I never tried, but maybe PHP's just adding the GZIP header that
> REBOL does not add?

It doesn't. At least that's what it says in the PHP documentation.

I've done some further investigation, and it turns out that rebol can indeed decompress some strings with specific characters. I believe the compression algorithm used in php and the one in rebol treat certain characters differently. I haven't been able to determine which characters. All I can see is that the compressed data from PHP and Rebol aren't exactly the same in some cases (I used a piece of compressed php source code which fails to decompress in rebol). The length is the same, but rebol's checksum command gives a different result for each compressed binary.

A failed decompression gives:

    ** Script Error: Not enough memory
    ** Where: halt-view
    ** Near: decompress <variable name>

I tried all compression levels in PHP, and found that level 6 comes closest to what rebol will decompress. I also found out that rebol appends the length of the original string, as a 32-bit number, to the end of the compressed data. PHP doesn't do that.

I guess I'll leave it alone for a while since I can't find a stable solution...

--
Regards
Henrik Mikael Kristensen
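Editor's note: the appended length described above can be illustrated with the sample bytes for `compress "Rebol"` that are quoted later in this thread. The sketch below is Python, not REBOL; the helper names `split_rebol`, `from_rebol` and `to_rebol` are hypothetical, and the little-endian byte order of the length is an assumption read off that one sample (05 00 00 00 for a 5-byte string). It has not been checked against REBOL itself.

```python
import struct
import zlib

# Sample from this thread: zlib header, deflate data, Adler-32, length.
REBOL_SAMPLE = bytes.fromhex("789C" "0B4A4DCACF0100" "05A301F5" "05000000")


def split_rebol(data: bytes):
    """Split REBOL-style output into (zlib stream, declared length)."""
    (length,) = struct.unpack("<I", data[-4:])  # assumed little-endian
    return data[:-4], length


def from_rebol(data: bytes) -> bytes:
    """Decompress REBOL-style output with plain zlib."""
    stream, length = split_rebol(data)
    text = zlib.decompress(stream)
    if len(text) != length:
        raise ValueError("length trailer does not match decompressed size")
    return text


def to_rebol(text: bytes) -> bytes:
    """Build a zlib stream followed by the 4-byte length trailer."""
    return zlib.compress(text) + struct.pack("<I", len(text))


print(from_rebol(REBOL_SAMPLE))
print(to_rebol(b"Rebol").hex())
```

Under these assumptions, data going from REBOL to PHP would need the last four bytes removed before decompression, and data going from PHP to REBOL would need them added. The "Not enough memory" error would be consistent with REBOL reading the last four bytes of a plain zlib stream as a huge length, though that is a guess.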

[4/5] from: g:santilli:tiscalinet:it at: 31-Jul-2002 11:40

Hi Henrik,

On Wednesday, July 31, 2002, 3:57:43 AM, you wrote:

HMK> I've done some further investigation and it turns out that rebol can indeed
HMK> decompress some strings with specific characters. I believe the compression
HMK> algorithm used in php and the one in rebol treats certain characters
HMK> differently. I haven't been able to determine which characters.

Maybe it's just a matter of what version of zlib REBOL is using? (Provided they are really using it instead of having written their own clone to save space. :)

Holger? Are you reading?

Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r

[5/5] from: oliva:david:seznam:cz at: 2-Aug-2002 12:33

Hello Gabriele,

Thursday, July 25, 2002, 4:13:23 PM, you wrote:

GS> Hi Henrik,
GS> On Monday, July 22, 2002, 4:02:01 AM, you wrote:
HMK>> Did anyone make rebol's compress match the one used in gzcompress() with php?
GS> I never tried, but maybe PHP's just adding the GZIP header that
GS> REBOL does not add?
GS> Regards,
GS> Gabriele.

I've tried it as well, and here is the result: Rebol produces the same output as the PHP function

    gzcompress($data_to_decompress, 9);

but with the uncompressed data size appended (the last 4 bytes). At least in all my tests the result was the same, and I was able to compress and decompress data between PHP and Rebol in both directions.

For reading/writing *.gz files from Rebol we need to work with the GZIP header... I've written this test gzip parser. It doesn't solve our problem; see the comments in the file and try to find a solution :-)

<code>
rebol [
    title: "GZip test parser"
    purpose: {Experimental script to test a possibility to decompress *.gz files}
    author: "Oldes"
    email: [oliva--david--seznam--cz]
    specification: http://www.faqs.org/rfcs/rfc1952.html
]

;gzipbin: read/binary %/c/web/fwp/www/test2.gz
;for testing I have already loaded compressed text: "Rebol"
gzipbin: #{1F8B080000000000000B0B4A4DCACF0100D2293B6505000000}
rebolbin: compress "Rebol"
;comparing these binaries we can see:
;gzipbin:  #{1F8B080000000000000B 0B4A4DCACF0100 D2293B65 05000000}
;rebolbin: #{                789C 0B4A4DCACF0100 05A301F5 05000000}

parse/all gzipbin [
    ;IDentification:
    #{1F8B}
    ;compression method:
    copy CM 1 skip (probe CM: to-binary CM)
    ;FLaGs:
    copy FLG 1 skip (probe FLG: enbase/base FLG 2)
    ;Modification TIME:
    copy MTIME 4 skip (probe MTIME: to-integer head reverse to-binary MTIME)
    ;eXtra FLags:
    copy XFL 1 skip (probe XFL: to-binary XFL)
    ;Operating System:
    copy OS 1 skip (probe OS: to-integer to-binary OS)
    tmp: (
        if FLG/6 = #"1" [
            ;eXtra LENgth
            XLEN: to-integer head reverse to-binary copy/part tmp 2
            tmp: skip tmp 2
            FEXTRA: copy/part tmp XLEN
            tmp: skip tmp XLEN
        ]
        if FLG/5 = #"1" [
            ;...original file name, zero-terminated...
            parse/all tmp [copy FNAME to #{00} 1 skip tmp: to end]
        ]
        if FLG/4 = #"1" [
            ;...file comment, zero-terminated...
            parse/all tmp [copy FCOMMENT to #{00} 1 skip tmp: to end]
        ]
    )
    :tmp
    copy rest to end (
        probe rest: to-binary rest
        ;the REST variable holds the zlib compressed data, 4 bytes of CRC32
        ;and 4 bytes of size of the uncompressed data...
        ;Can anybody find how to uncompress this data in Rebol???
        ;if I insert #{789C} at the head (as it is in 'rebolbin) I get the error:
        ;** Script Error: Invalid compressed data - problem: -3
        ;that probably means wrong CRC32 - :(( - RT will have to help us
    )
]
</code>
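Editor's note: the two samples compared in the script differ in more than the header. The four bytes after the deflate data are a CRC-32 stored little-endian in gzip (RFC 1952), but an Adler-32 stored big-endian in zlib (RFC 1950), so prepending #{789C} alone leaves a checksum that a zlib decoder will reject. Reading REBOL's "problem: -3" as zlib's Z_DATA_ERROR is an assumption. The Python sketch below walks the same header fields as the parser above and then rebuilds the trailer; `gzip_payload` and `gzip_to_rebol` are hypothetical names, and the output layout is modelled only on the `rebolbin` sample in this thread.

```python
import struct
import zlib

# The "Rebol" sample from the script above.
GZIP_SAMPLE = bytes.fromhex("1F8B080000000000000B0B4A4DCACF0100D2293B6505000000")

# Flag bits from RFC 1952.
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10


def gzip_payload(data: bytes):
    """Return (raw deflate data, CRC-32, uncompressed size) of one member."""
    if data[:2] != b"\x1f\x8b" or data[2] != 8:
        raise ValueError("not a deflate-compressed gzip member")
    flags = data[3]
    pos = 10  # fixed header: ID1 ID2 CM FLG MTIME(4) XFL OS
    if flags & FEXTRA:
        (xlen,) = struct.unpack_from("<H", data, pos)
        pos += 2 + xlen
    if flags & FNAME:
        pos = data.index(b"\x00", pos) + 1  # zero-terminated file name
    if flags & FCOMMENT:
        pos = data.index(b"\x00", pos) + 1  # zero-terminated comment
    if flags & FHCRC:
        pos += 2
    crc, isize = struct.unpack("<II", data[-8:])
    return data[pos:-8], crc, isize


def gzip_to_rebol(data: bytes) -> bytes:
    """Re-frame a gzip member as a zlib stream plus a 4-byte length."""
    raw, crc, isize = gzip_payload(data)
    text = zlib.decompress(raw, -15)  # wbits=-15 means raw deflate
    if zlib.crc32(text) != crc or len(text) != isize:
        raise ValueError("gzip trailer does not match the data")
    adler = struct.pack(">I", zlib.adler32(text))  # zlib trailer, big-endian
    return b"\x78\x9c" + raw + adler + struct.pack("<I", len(text))


print(gzip_to_rebol(GZIP_SAMPLE).hex().upper())
```

For this sample the result matches the `rebolbin` bytes shown in the script's comments, which suggests the checksum swap is the missing step. Whether REBOL accepts such a re-framed stream in general has not been tested here.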