• Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r4wp

[Rebol School] REBOL School

Endo
10-Aug-2012
[778x3]
It is what I like about this community :)

I knew that when I write a RLE function, BrianH will come up a much 
better version. Doc and others joined as well and now we have a very 
good function. Just like the CSV tools. Thanks.
Ehm.. what about the decoder? how do I decode unset! values?
I was using somthing like:

decode-rle: func [b /local r] [r: copy [] foreach [x y] b [loop x 
[append r y]]]
decode-rle: func [b /local r i] [

 i: 0  r: make block! foreach [x y] b [i: i + x] ;better for big blocks?
 foreach [x y] b [loop x [append r y]]
]
BrianH
10-Aug-2012
[781x3]
In mezzanine style:

decode-rle: func [
	"Decode a run length encoded block"
	rle [any-block!] "Block of [integer value]"

 /into "Insert into a buffer instead (returns position after insert)"
	output [series!] "The output buffer (modified)"
	/local x
] [
	unless into [
		x: 0  foreach [i v] :rle [x: x + :i]  output: make block! x
	]

 foreach [i v] :rle [output: insert/only/dup :output get/any 'v :i]
	either into [:output] [head :output]
]


Instead of testing for strict format compliance of the input block, 
it uses get-words to keep people from sneaking in functions and then 
passes the length value to + and INSERT/dup, counting on the type 
tests of those functions to do the screening for us.
You're right, having the make block! take the foreach expression 
as a parameter is safe; I forgot that make block! can take none as 
a parameter.
That should work in R3 as well. Though FORSKIP might be faster than 
FOREACH in R3, the simplicity of the code might be worth it.
Maxim
10-Aug-2012
[784]
I like the detail of using  :i   to prevent function hacking.   I 
should use it more often.
BrianH
10-Aug-2012
[785x5]
Trick I picked up when securing the mezzanines. It's slightly faster 
to evaluate too since it does less work.
The reason I use :output there isn't to prevent function hacking, 
it's to prevent converting lit-path! values to the path! type.
Sometimes you want to allow someone to pass in functions and then 
let them evaluate, as long as you have a good semantic model for 
what is supposed to happen and are careful about how you call them. 
The ARRAY, EXTRACT and REPLACE functions in R3 and R2 2.7.7+ are 
a good example of this.
Slightly more optimal version for R3, taking advantage of how get-words 
and get-paths mean GET/any, and how FORSKIP is faster than FOREACH:
decode-rle: func [
	"Decode a run length encoded block"
	rle [any-block!] "Block of [integer value]"

 /into "Insert into a buffer instead (returns position after insert)"
	output [series!] "The output buffer (modified)"
	/local x
] [
	unless into [
		x: 0  output: make block! forskip rle 2 [x: x + :rle/1]
	]
	forskip rle 2 [output: insert/only/dup :output :rle/2 :rle/1]
	either into [:output] [head :output]
]
Maxim
10-Aug-2012
[790]
these would be nice funcs to add to mezz in R3 and R2-forwards
BrianH
10-Aug-2012
[791x2]
Darn, just found a bug in ARRAY for R2 and R3. Litwords are converted 
to words and litpaths are converted to paths.
This is so obscure that I doubt it has affected any existing code 
though.
Arnold
10-Aug-2012
[793x2]
Discovered that d: 1.1.1 then d/1/2: 0 and d:/1/3: 0 and then d/1/2: 
d/1/3: 1 results in d == 1.1.0 ??

This keeps me just inches away from releasing my script before my 
holiday.
Only thing to add is 1 small function to reduce the moves when the 
king is under attack. I discovered some weird VID behaviour too where 
alert boxes have strange formats they inherited from earlier defines 
fields.
Steeve
10-Aug-2012
[795]
An alternative for R3 (strings and blocks)
rle: func [s /local p e o][
	o: copy []
	parse/case s [
		any [

   p: skip any [e: if (p/1 == e/1) skip] (repend o [offset? p e p/1])
		]
	]
	o
]
BrianH
11-Aug-2012
[796x9]
Steeve, that's basically the same as my R2 RLE's block rule, but 
with the IF workaround replaced with IF. It has a few gotchas:
- Executes function values in block data
- Doesn't handle unset! or error! values

- Converts lit-paths to paths and lit-words to words before comparison 
and again before putting in the output.
- Lots of intermediate block creation overhead

- Considers bindings of words when comparing them, not just case-sensitive 
spelling


The first 3 can be handled by using :p/1 and :e/1 instead of p/1 
and e/1, and the fourth by using REDUCE/into instead of REPEND. The 
last one can't be handled by any built-in function or operator in 
R3 (see http://issue.cc/r3/1834for details) but you could do a combination 
of functions and operators to get case-sensitive comparison without 
considering bindings. PARSE/case's QUOTE operation is the fastest 
method for doing that at the moment.


Nice job on neatly bypassing the relaxed bounds checking of R3 blocks. 
Though the if (p/1 == e/1) would succeed if p/1 is none and e is 
at the end of the block, the skip would still fail. That trick saves 
one e: operation.
R3's == operator handles unset and error values better than R2's 
though, which is why the explicit unset! testing in the rule can 
be removed.
Having a strict line of progression for the R3 equalities turned 
out to be a bad idea, since the binding check seems to be tripping 
up case checks.
Unfortunately, R3 development was put on hold before that could be 
fixed.
If you *want* to consider binding when doing your comparisons, perhaps 
for more lossless in-memory compression, then Steeve's IF == method 
is the way to go. If you want true lossless compression then you 
could even use =? to make sure that only runs of exact references 
compress.
The advantages of == or =? comparison over PARSE QUOTE would be lost 
if you serialize the data and save it to a file or send it over a 
network. REBOL syntax doesn't keep track of those distinctions.
The PARSE IF method does let you add a /compare function option though, 
so you can be as specific as you want. Instead of if (:p/1 == :e/1) 
you would do if (apply :f [:p/1 :e/1]) then pass :== or :strict-equal? 
as a parameter..
Here's a version of my last one above, but with Steeve's trick adapted 
to make a /compare option. It defaults to its old case-sensitive 
behavior.

rle: func [
	"Run length encode to series of [length value]"
	s [series!] "The series to encode"

 /into {Insert into a buffer instead (returns position after insert)}
	output [any-block!] "The output buffer (modified)"
	/compare "Comparator function for equvilance"
	comparator [any-function!]
	/local x r qr b e
] [
	unless into [output: make block! 2] x: none
	r: case [
		compare [[any [e: if (apply :comparator [:x :e/1]) skip]]]
		any-string? :s [[any x]]
		'else [qr: copy [quote 1] [(poke qr 2 :x) any qr]
	]
	parse/case :s [any [b: set x skip r e: (
		output: reduce/into [offset? :b :e :x] :output
	)]]
	either into [:output] [head :output]
]
Whoops, forgot a bracket:

rle: func [
	"Run length encode to series of [length value]"
	s [series!] "The series to encode"

 /into {Insert into a buffer instead (returns position after insert)}
	output [any-block!] "The output buffer (modified)"
	/compare "Comparator function for equvilance"
	comparator [any-function!]
	/local x r qr b e
] [
	unless into [output: make block! 2] x: none
	r: case [
		compare [[any [e: if (apply :comparator [:x :e/1]) skip]]]
		any-string? :s [[any x]]
		'else [qr: copy [quote 1] [(poke qr 2 :x) any qr]]
	]
	parse/case :s [any [b: set x skip r e: (
		output: reduce/into [offset? :b :e :x] :output
	)]]
	either into [:output] [head :output]
]
Steeve
11-Aug-2012
[805]
There's no need to store lengths of 1 in the output.
Length of 1 can be infered during decoding.
The compression ratio woulb be better.
BrianH
11-Aug-2012
[806x2]
Length of 1 can't be inferred in the decoding, not for blocks.
>> rle [1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6]
== [1 1 2 2 3 3 4 4 5 5 6 6]
Unless you make it so it treats integers specially. This would slow 
down the encoder and decoder, but reduce the compressed size.
Steeve
11-Aug-2012
[808]
and... you can't compress numbers anymore... forget it
BrianH
11-Aug-2012
[809]
You could reduce the compressed size of a string-specific RLE by 
putting runs of singletons into strings, like this:
>> rle "Hello World!"
== ["He" 2 #"l" "o World!"]
Steeve
11-Aug-2012
[810x2]
LZ77 could replace RLE. It would do RLE + patterns compression
I don't think it would be hard to code with parse
BrianH
11-Aug-2012
[812x2]
Agreed. Not that many repetive runs of characters in string data, 
so a better compression method would be preferable.
RLE is better for image data. Any takers?
Steeve
11-Aug-2012
[814x2]
I've some code when I studied 8bit computer data crunchers
But I didn't use parse at that time
BrianH
11-Aug-2012
[816]
RLE might help for binary data too, including that reduced encoding 
I mentioned for strings above.
Sujoy
13-Aug-2012
[817]
Thanks for that Ladislav (median calculation)
GrahamC
14-Aug-2012
[818x2]
>> do http://reb4.me/r/altjson.r
connecting to: reb4.me
Script: "REBOL <-> JSON" (15-Jul-2011)
>> j: to-json make object! [ b: none ]
== {{"b":null}}
but when you submit this as a JSON parameter ... it fails
Gabriele
14-Aug-2012
[820x2]
Javascript produces the exact same output:

> JSON.stringify({b:null});
'{"b":null}'
so, the problem has to be somewhere else.
GrahamC
14-Aug-2012
[822x3]
well, I can "fix it" by removing the surrounding { }
Just not sure why this is happening
{"b":null} is the actual value
{{"b":null}} is Rebol's way of telling us it's a string
Endo
14-Aug-2012
[825]
BrianH: the last rle function above is for R3?
BrianH
14-Aug-2012
[826]
Yes. It uses the IF and QUOTE operations, SET working on string parsing, 
PARSE default /all, :x meaning GET/any 'x, REDUCE/into, and equality 
finctions handling unset! values.
Kaj
14-Aug-2012
[827]
Hm, that's a lot of differences