More gz compression
[1/5] from: henrik:webz:dk at: 22-Jul-2002 4:02
Hi
Having followed the thread on compression, I found that rebol uses gzip. Fine,
I figured. PHP supports that too.
I'm trying to post compressed data via http to a php page using rebol's
internal compression scheme, which php should decompress and send into a
database, thus having rebol access a remote sql database that way.
Did anyone make rebol's compress match the one used in gzcompress() with php?
It seems they produce slightly different binaries. I've tried different
compression levels with php, but rebol barfs on the data sent from php to
rebol when trying to decompress it.
I haven't tried it the other way yet.
--
Regards,
Henrik Mikael Kristensen
[2/5] from: g:santilli:tiscalinet:it at: 25-Jul-2002 16:13
Hi Henrik,
On Monday, July 22, 2002, 4:02:01 AM, you wrote:
HMK> Did anyone make rebol's compress match the one used in gzcompress() with php?
I never tried, but maybe PHP's just adding the GZIP header that
REBOL does not add?
Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r
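A note on the formats involved: PHP has three related functions. gzcompress() writes a zlib stream (RFC 1950), gzencode() writes a gzip file (RFC 1952), and gzdeflate() writes raw deflate data (RFC 1951). The sketch below uses Python's zlib module purely as a neutral tool to show how the three wrappers sit around the same deflate body. It is an illustration only, not REBOL's or PHP's code, and the helper name `deflate` and the variable names are made up for the example.

```python
import zlib

text = b"Rebol"

def deflate(data: bytes, wbits: int) -> bytes:
    """Compress at level 6; the wbits value selects the wrapper format."""
    compressor = zlib.compressobj(level=6, wbits=wbits)
    return compressor.compress(data) + compressor.flush()

zlib_stream = deflate(text, 15)    # zlib wrapper, the format gzcompress() writes
gzip_stream = deflate(text, 31)    # gzip wrapper, the format gzencode() writes
raw_stream = deflate(text, -15)    # no wrapper, the format gzdeflate() writes

for name, blob in (("zlib", zlib_stream), ("gzip", gzip_stream), ("raw", raw_stream)):
    print(name, blob.hex())
```

The zlib stream starts with the two bytes 78 9C and ends with a 4-byte Adler-32 checksum; the gzip stream starts with the magic number 1F 8B and ends with a CRC-32 plus a 4-byte length. The deflate body in the middle is the same in all three.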
[3/5] from: henrik:webz:dk at: 31-Jul-2002 3:57
On Thursday 25 July 2002 16:13, you wrote:
> Hi Henrik,
>
> On Monday, July 22, 2002, 4:02:01 AM, you wrote:
>
> HMK> Did anyone make rebol's compress match the one used in gzcompress()
> with php?
>
> I never tried, but maybe PHP's just adding the GZIP header that
> REBOL does not add?
It doesn't. At least that's what it says in the PHP documentation.
I've done some further investigation and it turns out that rebol can indeed
decompress some strings with specific characters. I believe the compression
algorithm used in php and the one in rebol treat certain characters
differently. I haven't been able to determine which characters.
All I can see is that the compressed data from PHP and Rebol aren't exactly
the same in some cases (I used a piece of compressed php source code which
fails to decompress in rebol). The length is the same, but using rebol's
checksum command gives different results for each compressed binary.
A failed decompression gives:
** Script Error: Not enough memory
** Where: halt-view
** Near: decompress <variable name>
I tried all compression levels in PHP, and found that level 6 comes closest to
what rebol will decompress.
I also found out that rebol appends the length of the original string as a
32-bit number to the end of the compressed data. PHP doesn't do that.
I guess I'll leave it alone for a while since I can't find a stable
solution...
--
Regards
Henrik Mikael Kristensen
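The two observations above (different bytes at different compression levels, and an extra 32-bit length at the end of REBOL's output) can be sketched in Python. This is a hedged illustration, not REBOL's actual implementation: the helper names `rebol_to_zlib` and `zlib_to_rebol` are hypothetical, the sample bytes are the `compress "Rebol"` output quoted later in this thread, and the little-endian byte order of the length is inferred from that sample.

```python
import struct
import zlib

def rebol_to_zlib(blob: bytes) -> bytes:
    """Drop the assumed 4-byte length trailer, leaving a plain zlib stream."""
    return blob[:-4]

def zlib_to_rebol(stream: bytes, original_length: int) -> bytes:
    """Append the original length as a 32-bit little-endian number (assumed layout)."""
    return stream + struct.pack("<I", original_length)

text = b"Rebol"

# The compression level changes the second header byte (and possibly the body),
# so a checksum over the compressed bytes differs even though every stream
# decompresses to the same text.
for level in (1, 6, 9):
    stream = zlib.compress(text, level)
    print(level, stream[:2].hex(), zlib.decompress(stream) == text)

# REBOL-style binary as quoted later in the thread.
rebol_blob = bytes.fromhex("789C0B4A4DCACF010005A301F505000000")
(declared_length,) = struct.unpack("<I", rebol_blob[-4:])
restored = zlib.decompress(rebol_to_zlib(rebol_blob))
print(declared_length, restored)
```

If REBOL really reads the final four bytes as a length, then feeding it unmodified gzcompress() output would make it interpret the Adler-32 checksum as a size, which might explain the "Not enough memory" error. That is a guess and has not been confirmed in this thread.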
[4/5] from: g:santilli:tiscalinet:it at: 31-Jul-2002 11:40
Hi Henrik,
On Wednesday, July 31, 2002, 3:57:43 AM, you wrote:
HMK> I've done some further investigation and it turns out that rebol can indeed
HMK> decompress some strings with specific characters. I believe the compression
HMK> algorithm used in php and the one in rebol treat certain characters
HMK> differently. I haven't been able to determine which characters.
Maybe it's just a matter of what version of zlib REBOL is using?
(Provided they are really using it instead of having written their
own clone to save space. :)
Holger? Are you reading?
Regards,
Gabriele.
--
Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer
Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r
[5/5] from: oliva:david:seznam:cz at: 2-Aug-2002 12:33
Hello Gabriele,
Thursday, July 25, 2002, 4:13:23 PM, you wrote:
GS> Hi Henrik,
GS> On Monday, July 22, 2002, 4:02:01 AM, you wrote:
HMK>> Did anyone make rebol's compress match the one used in gzcompress() with php?
GS> I never tried, but maybe PHP's just adding the GZIP header that
GS> REBOL does not add?
GS> Regards,
GS> Gabriele.
I've tried it as well, and here is the result:
Rebol produces the same output as the PHP function:
gzcompress($data_to_decompress, 9);
but with the uncompressed data size appended (last 4 bytes)
(at least in all my tests the result was the same, and I was able to
compress/decompress data between PHP and Rebol and vice versa)
For reading/writing *.gz files from Rebol we need to work with the GZIP header...
I've written this test gzip parser. It doesn't solve our problem; see
the comments in the file and try to find a solution :-)
<code>
rebol [
    title: "GZip test parser"
    purpose: {Experimental script to test the possibility of decompressing *.gz files}
    author: "Oldes"
    email: [oliva--david--seznam--cz]
    specification: http://www.faqs.org/rfcs/rfc1952.html
]
;gzipbin: read/binary %/c/web/fwp/www/test2.gz
;for testing I have already loaded compressed text: "Rebol"
gzipbin: #{1F8B080000000000000B0B4A4DCACF0100D2293B6505000000}
rebolbin: compress "Rebol"
;comparing these binaries we can see:
;gzipbin:  #{1F8B080000000000000B 0B4A4DCACF0100 D2293B65 05000000}
;rebolbin: #{                789C 0B4A4DCACF0100 05A301F5 05000000}
parse/all gzipbin [
    ;IDentification:
    #{1F8B}
    ;compression method:
    copy CM 1 skip (probe CM: to-binary CM)
    ;FLaGs:
    copy FLG 1 skip (probe FLG: enbase/base FLG 2)
    ;Modification TIME:
    copy MTIME 4 skip (probe MTIME: to-integer head reverse to-binary MTIME)
    ;eXtra FLags:
    copy XFL 1 skip (probe XFL: to-binary XFL)
    ;Operating System:
    copy OS 1 skip (probe OS: to-integer to-binary OS)
    tmp:
    (
        if FLG/6 = #"1" [
            ;eXtra LENgth
            XLEN: to-integer head reverse to-binary copy/part tmp 2
            tmp: skip tmp 2
            FEXTRA: copy/part tmp XLEN
            tmp: skip tmp XLEN
        ]
        if FLG/5 = #"1" [
            ;...original file name, zero-terminated...
            parse/all tmp [copy FNAME to #{00} 1 skip tmp: to end]
        ]
        if FLG/4 = #"1" [
            ;...file comment, zero-terminated...
            parse/all tmp [copy FCOMMENT to #{00} 1 skip tmp: to end]
        ]
    )
    :tmp
    copy rest to end
    (
        probe rest: to-binary rest
        ;the REST variable now holds the zlib compressed data, 4 bytes of CRC32
        ;and 4 bytes with the size of the uncompressed data...
        ;Can anybody find how to uncompress these data in Rebol???
        ;if I insert #{789C} at the head (as it is in 'rebolbin) I get the error:
        ;** Script Error: Invalid compressed data - problem: -3
        ;that probably means wrong CRC32 - :(( - RT will have to help us
    )
]
</code>
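For comparison, the same header walk can be sketched in Python, again only as a neutral way to check the bytes. The function name `split_gzip_member` is hypothetical, and the sketch assumes a single-member gzip file. It suggests why prefixing #{789C} is not enough: a zlib stream must end with a big-endian Adler-32 of the uncompressed data, whereas the gzip trailer holds a little-endian CRC-32. The -3 in REBOL's error message matches zlib's Z_DATA_ERROR code, though whether REBOL reports zlib codes directly is an assumption.

```python
import struct
import zlib

FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10  # FLG bits from RFC 1952

def split_gzip_member(blob: bytes):
    """Return (raw deflate body, stored CRC-32, stored size) of one gzip member."""
    if blob[:2] != b"\x1f\x8b":
        raise ValueError("missing gzip magic number")
    method, flags, _mtime, _xfl, _os = struct.unpack("<BBIBB", blob[2:10])
    if method != 8:
        raise ValueError("only the deflate method (8) is defined")
    pos = 10
    if flags & FEXTRA:                      # 2-byte length, then the extra field
        (xlen,) = struct.unpack("<H", blob[pos:pos + 2])
        pos += 2 + xlen
    if flags & FNAME:                       # zero-terminated original file name
        pos = blob.index(b"\x00", pos) + 1
    if flags & FCOMMENT:                    # zero-terminated comment
        pos = blob.index(b"\x00", pos) + 1
    if flags & FHCRC:                       # 2-byte header checksum
        pos += 2
    stored_crc, stored_size = struct.unpack("<II", blob[-8:])
    return blob[pos:-8], stored_crc, stored_size

gzipbin = bytes.fromhex("1F8B080000000000000B0B4A4DCACF0100D2293B6505000000")
body, stored_crc, stored_size = split_gzip_member(gzipbin)

data = zlib.decompress(body, wbits=-15)     # negative wbits means raw deflate
crc_matches = zlib.crc32(data) == stored_crc

# Rebuild the layout seen in 'rebolbin: zlib header, deflate body,
# big-endian Adler-32, then the little-endian uncompressed length.
rebol_style = (
    b"\x78\x9c"
    + body
    + struct.pack(">I", zlib.adler32(data))
    + struct.pack("<I", len(data))
)
print(data, stored_size, crc_matches)
print(rebol_style.hex().upper())
```

Note that computing the Adler-32 requires the uncompressed data, so this rebuild is only useful as a check of the layout. It does not by itself give REBOL a way to read *.gz files.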