World: r3wp
[!REBOL3 Proposals] For discussion of feature proposals
Ladislav 28-Jan-2011 [933x2] | right now the GC is very cumbersome. it waits for it to have 3-5MB before working. and it can take a noticeable amount of time to do when there is a lot of ram. I've had it freeze for a second in some apps. - what exactly does the GC have in common with the "Deduplicate issue"? |
I demonstrated above that, in fact, nothing. | |
BrianH 28-Jan-2011 [935] | But that doesn't mean that deallocating immediately will be any more efficient; likely it won't. |
Ladislav 28-Jan-2011 [936] | This is all just pretending. If what is needed is some incremental/generational/other GC variant, then no DEDUPLICATE can help with that |
BrianH 28-Jan-2011 [937x2] | We don't need DEDUPLICATE to help with the GC. He was suggesting that having it be native, rather than a mezzanine, would help reduce pressure on the GC when it is used for other reasons. I don't think it will by much. |
He needs DEDUPLICATE for his own code. The GC also needs work, but that is another issue :) | |
Maxim 28-Jan-2011 [939] | if we implement DEDUPLICATE as a mezz, we are juggling data, which inevitably taxes the GC. doing this natively helps keep the GC from working too hard. the problem is not how long/fast the allocation/deallocation is... it's the fact that cramming data into the GC's care will make it trigger for longer/more often. |
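For context, a mezzanine DEDUPLICATE along the lines being debated might look roughly like this (a hypothetical sketch; the name and body are illustrative, not an actual R3 mezzanine). UNIQUE allocates a fresh series, which is exactly the intermediate garbage being discussed:

```rebol
; Hypothetical mezzanine DEDUPLICATE: modifies its argument in place,
; but still allocates an intermediate series via UNIQUE.
deduplicate: func [series [series!] /local u] [
    u: unique series            ; allocates a new series (GC garbage)
    head insert clear series u  ; write the unique items back in place
]
```

Because the argument itself is cleared and refilled, all existing references to the series see the deduplicated result, unlike the `buffer: unique buffer` idiom.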
BrianH 28-Jan-2011 [940] | But you would need to deallocate every time the function is called, instead of just when the GC is called. This is often slower. |
Ladislav 28-Jan-2011 [941] | Did you read Carl's note to the subject? |
BrianH 28-Jan-2011 [942] | The subject of the set function implementation, or the GC implementation and how it compares to direct deallocation? If the latter then no. |
Ladislav 28-Jan-2011 [943] | I meant the note more to Max, and it was about the set function |
BrianH 28-Jan-2011 [944] | Ah, cool :) |
Ladislav 28-Jan-2011 [945] | The GC is not a slow approach to garbage collection. The main problem is that it is "unpredictable", possibly producing delays during which other processing stops. (But that does not mean that immediate collection would be faster.) |
Maxim 28-Jan-2011 [946] | yes I did. |
Ladislav 28-Jan-2011 [947] | the "stop the world" approach is disruptive for user interfaces, which might need a different type of GC... |
BrianH 28-Jan-2011 [948] | Also, multitasking could require a different kind of GC. Any thoughts on this, Ladislav? |
Maxim 28-Jan-2011 [949x2] | just adding a generational system to the GC would help a lot. I've read that some systems also use reference counting and mark-and-sweep together to get better performance on highly volatile data. |
though I guess it requires somewhat more structured code than REBOL's to properly predict what is expected to be volatile. | |
Ashley 28-Jan-2011 [951] | re DEDUPLICATE ... it's not just GUI code that would benefit, DB code often needs to apply this on the result set. "buffer: unique buffer" is a common idiom, which can be problematic with very large datasets. |
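The idiom in question, sketched out (illustrative REBOL; SAME? checks series identity). The catch is that UNIQUE returns a newly allocated series, so the word is rebound while any other reference still points at the old, duplicate-laden data:

```rebol
buffer: [1 2 2 3 3 3]
other-ref: buffer          ; a second reference to the same series
buffer: unique buffer      ; UNIQUE allocates and returns a new series
probe buffer               ; [1 2 3]
probe other-ref            ; [1 2 2 3 3 3] -- still sees the duplicates
probe same? buffer other-ref  ; false -- two distinct series in memory
```

With a very large result set, both the old and the new series are alive at once until the GC reclaims the old one, which is the memory concern raised here.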
BrianH 28-Jan-2011 [952] | You still need an intermediate dupe even for DEDUPLICATE. This just makes it so the argument is modified, in case there are multiple references to it. |
Ladislav 29-Jan-2011 [953] | "it's not just GUI code that would benefit, DB code often needs to apply this on the result set. buffer: unique buffer is a common idiom, which can be problematic with very large datasets" - that is exactly where, as I was trying to explain, it is just a superstition. buffer: unique buffer is about as memory-hungry as you can get; any DEDUPLICATE just pretends that does not happen, when, in fact, it does |
Ashley 29-Jan-2011 [954] | Does DEDUPLICATE *have to* create a new series? How inefficient would it be to SORT the series and then REMOVE duplicates starting from the TAIL? Inefficient as a mezz, but as a native? |
Ladislav 29-Jan-2011 [955] | Deduplicate does not have to use auxiliary data, if the goal is to use an inefficient algorithm. But, in that case, there's no need to have it as a native. |
Maxim 29-Jan-2011 [956] | even if done without aux data, it will still be MUCH faster done as a native. |
Ladislav 29-Jan-2011 [957x2] | No |
But, as Ashley pointed out, if there's no need to keep the original order, it can be done in place easily | |
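Ashley's sort-then-remove idea can be sketched in place, with no auxiliary series (hypothetical code; it assumes the series holds comparable values, contains no NONE values, and that the original order need not be preserved):

```rebol
; In-place deduplicate: SORT groups duplicates together,
; then a single pass removes each run of repeats.
dedup-sorted: func [series [series!]] [
    sort series
    forall series [
        ; series/2 is NONE at the last element, ending the run
        while [series/1 = series/2] [remove next series]
    ]
    head series
]
```

For example, `dedup-sorted [3 1 2 3 2 1]` should leave the series as `[1 2 3]`. The cost is dominated by SORT, so it is O(n log n) rather than the O(n^2) of a naive unsorted in-place scan, at the price of losing the original order.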
Maxim 29-Jan-2011 [959] | Lad, just looping without doing anything is slow in REBOL. |
Ladislav 29-Jan-2011 [960] | That does not prove your point |
Maxim 29-Jan-2011 [961] | the usual finding is that things done in extensions are at least 10 times faster, and Carl has shown a few examples which were 30x faster. really Lad, there is no comparison. |
Ladislav 29-Jan-2011 [962] | Still missing the argument |
Maxim 29-Jan-2011 [963] | you can elaborate if you want... I'm just passing thru, in between two jobs in the renovations... will be back later. |
Ladislav 29-Jan-2011 [964x4] | To find out what is wrong, just write an "in place" version of DEDUPLICATE in REBOL, divide the time needed to deduplicate a 300-element series by 30, and compare to an algorithm (in REBOL again) allowed to use auxiliary data. |
You will find that the fact that the algorithm is native does not help. | |
Or, to make it even easier, just use an "in-place deduplicate" written in REBOL, divide the time to deduplicate a 300-element series by 30, and compare to the time UNIQUE takes (UNIQUE uses aux data, i.e. a more efficient algorithm) | |
You will find that the difference made by an inappropriate algorithm is so huge that even as a native it would be too slow compared to an efficient algorithm written in REBOL | |
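Ladislav's experiment can be sketched like so (hypothetical benchmark code; the helper name, data shape, and loop counts are all illustrative). Even after dividing the in-place time by 30 to simulate a native speedup, the O(n^2) algorithm tends to lose to UNIQUE's aux-data approach as the input grows:

```rebol
; Naive O(n^2) in-place deduplicate, written in REBOL
dedup-in-place: func [series [series!]] [
    forall series [
        ; remove every later occurrence of the current value
        remove-each value next series [value = first series]
    ]
    head series
]

data: copy []
loop 300 [append data random 50]   ; 300 elements, many duplicates

t0: now/precise
loop 100 [dedup-in-place copy data]  ; quadratic, no aux data
t1: now/precise
loop 100 [unique data]               ; aux data, efficient algorithm
t2: now/precise

print ["in-place:" difference t1 t0 "unique:" difference t2 t1]
```

The point of the exercise is that the algorithmic gap (quadratic vs. near-linear) dwarfs any constant-factor gain from going native.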
Maxim 29-Jan-2011 [968] | yes, I was not comparing two different algorithms... but the same algorithm done natively vs. interpreted. |
Ladislav 29-Jan-2011 [969] | I am curious who you think would agree to write that nonsense |
Oldes 30-Jan-2011 [970] | You talk about it so much that someone could have written an extension in the same time and given real proof :) What I can say is that using an additional series is a big speed enhancement. At least it was when I was doing my colorizer. |
Ladislav 30-Jan-2011 [971x2] | "You talk about it so much that someone could have written an extension in the same time and given real proof :) What I can say is that using an additional series is a big speed enhancement." - actually, it has been proven already; just look at the performance of the UNIQUE, etc. functions |
That is why I do not understand why it is so hard to understand. | |
BrianH 31-Jan-2011 [973x2] | ALIAS function removed in the next version; see http://issue.cc/r3/1163 http://issue.cc/r3/1164http://issue.cc/r3/1165and http://issue.cc/r3/1835 for details. Also, http://issue.cc/r3/1212dismissed as unnecessary now. |
R3 still uses case aliasing internally for word case insensitivity, but there is no other aliasing. | |
Maxim 31-Jan-2011 [975x2] | unfortunately, in extensions, case insensitivity doesn't work. |
(unless it's been fixed in the very latest versions) | |
BrianH 31-Jan-2011 [977x2] | Do you mean in object contexts, or the words block? |
Either way, report it. | |
Maxim 31-Jan-2011 [979x2] | word blocks for sure, didn't test on objects. |
I will, next time I play with the hostkit. Right now I couldn't provide example code which demonstrates it. | |
BrianH 31-Jan-2011 [981] | A report like "Words are not case insensitive in extension word blocks" would help a lot. Carl has been bug fixing lately, and that includes extension bugs. |
Maxim 31-Jan-2011 [982] | ok... will add that to begin with. |