World: r3wp
[!REBOL3 Proposals] For discussion of feature proposals
Maxim 13-Jan-2011 [627x2] | anyhow... we are a bit off topic from my original post. I now understand why GC is very slow on large datasets. |
but it's still possible to get that information, maybe it could be a refinement, as well as one for MEM stats (if it's also heavy to resolve) | |
Ladislav 13-Jan-2011 [629] | "but it's still possible to get that information" - which one? References are not counted. |
Maxim 13-Jan-2011 [630] | no, but by traversing the "heap" (which is what I am guessing he is doing) he can count all references to a particular item. and YES it will be slow. but for debugging it could be EXTREMELY useful. even more useful would be returning all the items which the GC has encountered which have references. |
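The idea of counting references by walking everything reachable can be sketched in Python. This is only an illustration of the general technique under stated assumptions: the function name `count_references` and the use of plain lists and dicts as the "heap" are hypothetical, and nothing here reflects how REBOL's collector is actually implemented.

```python
def count_references(roots, target):
    """Count the container slots reachable from `roots` that point at `target`.

    Hypothetical helper: walks lists, tuples and dict values only.
    """
    seen = set()          # ids of containers already visited (handles cycles)
    stack = list(roots)   # explicit stack instead of recursion
    count = 0
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        if isinstance(obj, dict):
            children = list(obj.values())
        elif isinstance(obj, (list, tuple)):
            children = list(obj)
        else:
            children = []  # scalars hold no references
        for child in children:
            if child is target:
                count += 1
            stack.append(child)
    return count


item = ["payload"]
a = [item, item]
b = {"x": item, "peer": a}
print(count_references([a, b], item))
```

The walk visits every reachable container once, so its cost grows with the size of the whole dataset rather than with the item being inspected, which is why such a query would be slow on large data.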
Ladislav 13-Jan-2011 [631] | I do not say reference counting *cannot* be done. I say reference counting *is not* implemented. |
Maxim 13-Jan-2011 [632] | ah ok ;-) |
Ladislav 13-Jan-2011 [633x2] | (I may be wrong, though) |
but, taking into account, that it is not needed (the GC does not need to know that), it would be a waste of time/code | |
Gregg 13-Jan-2011 [635] | There are things I would rather see time spent on than the proposed INFO? enhancement. |
BrianH 13-Jan-2011 [636x3] | REBOL isn't heading towards less error triggering as a rule (though there may be less in practice); it is headed towards making errors trigger on more useful occasions. The principle is that errors are your friends, and if they aren't, we need to reevaluate them on a case by case basis. In some cases we have made things trigger an error in R3 that didn't in R2, with the hope of overall benefits in security, stability and debuggability. We even added ASSERT to explicitly trigger errors. |
In the case of LENGTH? and INDEX?, we are only allowing them to return none when passed none, using none as the equivalent of SQL's unknown, not N/A. And even those are a bit iffy - we are only planning to allow those to have fewer intermediate none checks when doing none pass-through on FIND and SELECT return values. However, the disadvantage is that errors aren't triggered close enough to the source of the error. Hopefully they will be triggered later by ASSERT calls. | |
My preference would be to not lose the valuable information about which values have length and which shouldn't be passed to LENGTH?. We definitely don't want too many nones to be propagated to data that doesn't expect them; none leaks are tough to track down, which is the whole purpose of the unset! type. | |
shadwolf 13-Jan-2011 [639] | maxim I vote for your proposal. LENGTH? returning none if the argument can't be traversed could be a test on the type of the tested thing, but after reading your way of presenting this I'm kinda convinced ... As for the GC discussion, I really liked the "it doesn't work like this but I don't know anything about how it works". isn't that a mutual exclusion ? |
BrianH 13-Jan-2011 [640x2] | Not really. We can determine from testing the ways that it *doesn't* work. |
We don't know the exact algorithm, but we know a lot about it, and that it's proprietary. | |
shadwolf 13-Jan-2011 [642] | that's the kind of high level discussion that should conclude with high level documentation about memory management and gc predictions ... but probably better wasting the opportunity ... once again. I would not ask why? |
BrianH 13-Jan-2011 [643x3] | GC stuff that we can figure out from experimentation and conversations: |
- Mark and sweep - Not generational or copying - Probably zoned into block and binary/string zones. | |
Also, Carl has claimed that it is an algorithm that is unique to REBOL, though he has not gone into details. | |
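For readers unfamiliar with the terms, a mark and sweep collector that is neither copying nor generational can be sketched as below. This is a textbook illustration in Python, not REBOL's actual (proprietary) algorithm; the `Cell` and `Heap` classes and their methods are hypothetical names made up for the sketch.

```python
class Cell:
    """A heap object that may reference other cells."""
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


class Heap:
    def __init__(self):
        self.cells = []

    def alloc(self, name):
        cell = Cell(name)
        self.cells.append(cell)
        return cell

    def collect(self, roots):
        """Stop-the-world collection; returns the number of cells freed."""
        # Mark phase: flag everything reachable from the roots.
        stack = list(roots)
        while stack:
            cell = stack.pop()
            if cell.marked:
                continue
            cell.marked = True
            stack.extend(cell.refs)
        # Sweep phase: drop unmarked cells. Survivors are never moved or
        # copied, so references held elsewhere stay valid.
        survivors = [c for c in self.cells if c.marked]
        freed = len(self.cells) - len(survivors)
        for c in survivors:
            c.marked = False  # reset for the next cycle
        self.cells = survivors
        return freed


heap = Heap()
a, b, c, d = (heap.alloc(n) for n in "abcd")
a.refs.append(b)   # a -> b, reachable from the root
c.refs.append(d)   # c <-> d form an unreachable cycle
d.refs.append(c)
print(heap.collect(roots=[a]))
```

Both phases touch the whole heap on every run, which is consistent with the observation earlier in the thread that collection gets slow on large datasets.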
Maxim 13-Jan-2011 [646] | I think adding a system like generational would be a big benefit to REBOL. my guess is that Carl has a twist on this idea. |
shadwolf 13-Jan-2011 [647] | it's not the first time we have this discussion. years ago we had this discussion about GC and memory management. it's a cyclic concern and a cyclic discussion until we DC it :) (Document & Collect it) |
Maxim 13-Jan-2011 [648] | and because memory isn't released to the OS very often, I can make a fair guess that the GC doesn't compact on collect. |
shadwolf 13-Jan-2011 [649] | and clearly speaking about guru level stuff and not having the beginning of a trail on such an issue makes me crazy, just because no one wants to do this basic effort ... |
Maxim 13-Jan-2011 [650] | but the GC can only be speculated about. so there isn't a lot of point in documenting assumptions. |
shadwolf 13-Jan-2011 [651] | anyway GC thing will tend to be less crucial with the default local vars in func no ? |
Maxim 13-Jan-2011 [652x2] | unless Carl did a document about it, which would be very nice. (hint hint ... RM-Asset guys? ;-) |
no since the memory allocated by any function still has to be managed by some sort of heap, since the data can and often does exist beyond the stack. | |
shadwolf 13-Jan-2011 [654x2] | maxim hum, you know, you write a splendid documentation full of stupidities to make Carl go into verbose mode and correct it. that's what's called preaching the false to obtain the true :) |
that's some inverted spycology ... | |
Maxim 13-Jan-2011 [656] | spycology... hahaha... typo or not, its a good word. |
BrianH 13-Jan-2011 [657] | The main reason that we haven't speculated on it is that it doesn't really matter what the real algorithm is, since it's internal and can't be swapped. All that matters is externally determinable traits, like whether data can move and when. Any moving collector requires a lot more effort to work with C libraries and extensions; that includes copying and generational collectors. |
shadwolf 13-Jan-2011 [658x3] | maxim or not :) |
convenient dyslexia :) | |
brianh in fact we speculated a lot about it with steeve and maxim around spring 2009 when we were researching area-tc ... at that time i should have written a completely stupid high level documentation full of speculations. maybe by now we would have a better view on this particular topic | |
Maxim 13-Jan-2011 [661x2] | the extensions API actually marshals all of that so the internals of the GC don't really affect the host kit that much. Apart from the raw image! data, we don't really have any reason to believe that any pointer we share with the core is persistent. In any case I don't rely on it, because AFAIK Carl has insinuated that we never really have access to the internals and pointers we get are actually interfaces, not internal references (for security reasons). |
the issue is that the GC does affect my code a lot in big apps and I'd like to know exactly when and how it works so I can better guess when it's about to jerk my code in an animation or while I'm scrolling stuff. | |
Ladislav 14-Jan-2011 [663x2] | "exactly when and how it works" - there are at least three reasons why this is what you can't get: 1) *when* the GC works is "unpredictable" from the programmer's POV (depending on other code, etc.) 2) It is (or may be) subject to adjustments or changes, without the programmer being able to detect such changes, so why should he know? 3) programming for a specific GC variant should be seen as a typical bad practice - why should you want to make code that is supposed to work only in specific circumstances? Not to mention that you actually cannot do that anyway, since the GC should be programmed to hide any implementation details |
As to why I know the GC does not count references: I know it, since it is documented widely, that reference counting cannot be used for garbage collection in general environments. | |
Andreas 14-Jan-2011 [665x2] | Refcounting w/ cycle detection or a hybrid refcounting + tracing approach certainly can. |
But in various bits and pieces of REBOL documentation it's hinted that the REBOL GC is a tracing mark-sweep variant. And I seem to recall that Carl affirmed this at least once, but I don't have any link to the source handy. | |
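Both points in this exchange (plain reference counting cannot reclaim cycles, but a hybrid of refcounting plus tracing can) can be observed in CPython, which happens to use exactly such a hybrid. The sketch below says nothing about REBOL; the `Node` class and `observe_cycle` function are hypothetical names, and the first result is specific to CPython.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def observe_cycle():
    """Return (alive after refcounting only, alive after a tracing pass)."""
    was_enabled = gc.isenabled()
    gc.disable()                   # leave only reference counting active
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a    # a <-> b reference each other
        probe = weakref.ref(a)     # lets us observe a without keeping it alive
        del a, b                   # no outside references remain
        alive_after_refcount = probe() is not None
        gc.collect()               # tracing pass finds the unreachable cycle
        alive_after_trace = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_after_refcount, alive_after_trace


print(observe_cycle())
```

On CPython the cycle survives while only refcounting is active, because each node still holds a count of one from its partner, and it is reclaimed once the tracing pass runs.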
Ladislav 14-Jan-2011 [667] | I think it is safe to consider REBOL GC to be mark&sweep. But I am totally at a loss as to how that can help Max write any of his code. |
Maxim 14-Jan-2011 [668x2] | this is just helpful for debugging. it's very useful to track down memory leaks, which are pretty easy to create in GC systems since we don't manage any part of it. |
also if it has several heaps, or tricks like generations, or whatever, we can optimise the code so that we make it work in the "best case" of the GC instead of the worst case. right now all I know is that the GC is very disruptive in very large apps (in R2) because you can actually see the application jam for a noticeable amount of time when there is animation or interactive things being done. | |
BrianH 14-Jan-2011 [670] | Alas, that kind of problem is exactly why you don't want to rely on the implementation model of the GC. One thing we know about the GC is that it is the stop the world type. If we need to solve the problem you mention, we would have to completely replace the internal memory model with a different one and use a GC with a completely different algorithm (generational, incremental, parallel, whatever). That means that all of your code optimizations would no longer work. If you don't try to optimize your code to the GC, it makes it possible to optimize the GC itself. |
Maxim 14-Jan-2011 [671x3] | BrianH, if I deliver an application to a client and he says to my face... why does it jerk every 2 seconds? I really don't care if the GC might change. right now I can't do anything to help it. if it changes, I will adapt my code to it again. This is platform tuning and it is inherently "close to the metal", but in real life, such things are useful to do... just look at the 1 second boot linux machine in that other group. |
something like this document (which has a Best Practices at the end) would be really nice. http://vineetgupta.spaces.live.com/blog/cns!8DE4BDC896BEE1AD!1104.entry | |
obviously, since the GC is a "stop the world" system, I can't fix everything, but I might be able to help it along a little bit. | |
BrianH 14-Jan-2011 [674x2] | You can time when you call it... |
I'm hoping that we might get a better collector when we work R3 over for concurrency. | |
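The suggestion to time the collector yourself can be sketched as follows. This is a Python analogy only, using Python's `gc` module to stand in for whatever explicit collection call the runtime offers; the `run_frames` function and its `collect_every` parameter are hypothetical names invented for the sketch.

```python
import gc


def run_frames(frame_count, render, collect_every=30):
    """Render frames with automatic GC off, collecting only at chosen points.

    Returns how many explicit collections were triggered.
    """
    was_enabled = gc.isenabled()
    gc.disable()                      # no surprise pauses mid-frame
    collections = 0
    try:
        for frame in range(1, frame_count + 1):
            render(frame)
            if frame % collect_every == 0:
                gc.collect()          # pay the pause at a moment we picked
                collections += 1
    finally:
        if was_enabled:
            gc.enable()               # restore the previous setting
    return collections


print(run_frames(90, lambda frame: None, collect_every=30))
```

This does not make a stop-the-world pause shorter; it only moves the pause to a point where a hitch is less visible, such as between animations or after user input has been handled.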
Maxim 14-Jan-2011 [676] | yes, with proficient concurrency, it's much easier to make non-blocking GC. |