World: r3wp

[!REBOL3 Proposals] For discussion of feature proposals

Ladislav
13-Jan-2011
[629]
"but it's still possible to get that information"
 - which information? References are not counted.
Maxim
13-Jan-2011
[630]
No, but by traversing the "heap" (which is what I am guessing he is doing) 
he can count all references to a particular item. And YES, it will 
be slow, but for debugging it could be EXTREMELY useful. Even more 
useful would be returning all the items the GC has encountered 
which have references.
Ladislav
13-Jan-2011
[631]
I do not say reference counting *cannot* be done. I say reference 
counting *is not* implemented.
Maxim
13-Jan-2011
[632]
ah ok  ;-)
Ladislav
13-Jan-2011
[633x2]
(I may be wrong, though)
but, taking into account that it is not needed (the GC does not 
need to know that), it would be a waste of time/code
Gregg
13-Jan-2011
[635]
There are things I would rather see time spent on than the proposed 
INFO? enhancement.
BrianH
13-Jan-2011
[636x3]
REBOL isn't heading towards less error triggering as a rule (though 
there may be less in practice); it is headed towards making errors 
trigger on more useful occasions. The principle is that errors are 
your friends, and if they aren't, we need to reevaluate them on a 
case by case basis. In some cases we have made things trigger an 
error in R3 that didn't in R2, with the hope of overall benefits 
in security, stability and debuggability. We even added ASSERT to 
explicitly trigger errors.
In the case of LENGTH? and INDEX?, we are only allowing them to return 
none when passed none, using none as the equivalent of SQL's unknown, 
not N/A. And even those are a bit iffy - we are only planning to allow 
those to have fewer intermediate none checks when doing none pass-through 
on FIND and SELECT return values. However, the disadvantage is that 
errors aren't triggered close enough to the source of the error. 
Hopefully they will be triggered later by ASSERT calls.
My preference would be to not lose the valuable information about 
which values have length and which shouldn't be passed to LENGTH?. 
We definitely don't want too many nones to be propagated to data 
that doesn't expect them; none leaks are tough to track down, which 
is the whole purpose of the unset! type.
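
A minimal sketch of the trade-off described above, written in Python rather than REBOL, with None standing in for none. The helper names (find, length_strict, length_passthrough) are hypothetical and only mimic the behaviour under discussion; they are not REBOL's actual implementation.

```python
def find(series, item):
    """Return the tail of series starting at item, or None if absent (FIND-like)."""
    return series[series.index(item):] if item in series else None


def length_strict(value):
    """Raise at once when the value has no length: the error fires near its source."""
    if value is None:
        raise TypeError("length_strict does not accept None")
    return len(value)


def length_passthrough(value):
    """Return None when given None, so a failed lookup needs no intermediate check."""
    return None if value is None else len(value)


data = ["a", "b", "c"]

hit = length_passthrough(find(data, "b"))    # tail is ["b", "c"]
miss = length_passthrough(find(data, "z"))   # None propagates silently

try:
    length_strict(find(data, "z"))
    strict_error = None
except TypeError as err:
    strict_error = str(err)                  # the failure is reported immediately
```

With the pass-through variant the None in miss travels onward until something else trips over it, which is the "none leak" concern; the strict variant reports the problem at the call that caused it.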
shadwolf
13-Jan-2011
[639]
Maxim, I vote for your proposal that LENGTH? return none if the argument 
can't be traversed. It could be a test on the type of the tested thing, 
but after reading the way you presented this I'm kinda convinced... 
As for the GC discussion, I really liked the "it doesn't work like 
this, but I don't know anything about how it works". Isn't that a 
mutual exclusion?
BrianH
13-Jan-2011
[640x2]
Not really. We can determine from testing the ways that it *doesn't* 
work.
We don't know the exact algorithm, but we know a lot about it, and 
that it's proprietary.
shadwolf
13-Jan-2011
[642]
That's the kind of high-level discussion that should conclude with 
high-level documentation about memory management and GC predictions 
... but we'll probably waste the opportunity ... once again. I 
won't ask why.
BrianH
13-Jan-2011
[643x3]
GC stuff that we can figure out from experimentation and conversations:
- Mark and sweep
- Not generational or copying
- Probably zoned into block and binary/string zones.
Also, Carl has claimed that it is an algorithm that is unique to 
REBOL, though he has not gone into details.
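
For readers unfamiliar with the terms, here is a toy mark-and-sweep pass in Python. It is only a sketch of the general technique over a hypothetical Obj heap; nothing here is claimed to match REBOL's proprietary collector.

```python
class Obj:
    """A toy heap object with outgoing references and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark(roots):
    """Mark phase: flag everything reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects in place and reset their mark bits."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live


def collect(heap, roots):
    mark(roots)
    return sweep(heap)


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)      # a -> b, reachable from the root
c.refs.append(d)      # c <-> d form a cycle that no root reaches
d.refs.append(c)

heap = [a, b, c, d]
survivors = [obj.name for obj in collect(heap, roots=[a])]
```

Survivors are never moved or copied, which is the "not copying" property, and the unreachable c/d cycle is dropped without any reference counts being kept.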
Maxim
13-Jan-2011
[646]
I think adding a system like generational would be a big benefit 
to REBOL.  my guess is that Carl has a twist on this idea.
shadwolf
13-Jan-2011
[647]
It's not the first time we've had this discussion. Years ago we had 
this same discussion about GC and memory management. It's a cyclic 
concern and a cyclic discussion until we DC it :) (Document & Collect 
it)
Maxim
13-Jan-2011
[648]
and because memory isn't released to the OS very often, I can make 
a fair guess that the GC doesn't compact on collect.
shadwolf
13-Jan-2011
[649]
And clearly, speaking about guru-level stuff and not having even the 
beginning of a trail on such an issue makes me crazy, just because no one 
wants to make this basic effort ...
Maxim
13-Jan-2011
[650]
But the GC can only be speculated about, so there isn't a lot of point 
in documenting assumptions.
shadwolf
13-Jan-2011
[651]
Anyway, the GC thing will tend to be less crucial with default local 
vars in funcs, no?
Maxim
13-Jan-2011
[652x2]
unless Carl did a document about it, which would be very nice.  (hint 
hint ... RM-Asset guys?  ;-)
No, since the memory allocated by any function still has to be managed 
by some sort of heap, since the data can and often does exist beyond 
the stack.
shadwolf
13-Jan-2011
[654x2]
Maxim, hum, you know, you write a splendid documentation full of stupidities 
to make Carl go into verbose mode and correct it. That's what's called 
preaching the false to obtain the true :)
that's some inverted spycology ...
Maxim
13-Jan-2011
[656]
spycology... hahaha... typo or not, it's a good word.
BrianH
13-Jan-2011
[657]
The main reason that we haven't speculated on it is that it doesn't 
really matter what the real algorithm is, since it's internal and 
can't be swapped. All that matters is externally determinable traits, 
like whether data can move and when. Any moving collector requires 
a lot more effort to work with C libraries and extensions; that includes 
copying and generational collectors.
shadwolf
13-Jan-2011
[658x3]
Maxim, or not :)
convenient dyslexia :)
BrianH, in fact we speculated a lot about it with Steeve and Maxim 
around spring 2009 when we were researching area-tc ... At that 
time I should have written a completely stupid high-level documentation 
full of speculations; maybe by now we would have a better view on this 
particular topic.
Maxim
13-Jan-2011
[661x2]
the extensions API actually marshals all of that so the internals 
of the GC don't really affect the host kit that much.  


Apart from the raw image! data, we don't really have any reason 
to believe that any pointer we share with the core is persistent.


In any case I don't rely on it, because AFAIK Carl has insinuated 
that we never really have access to the internals and the pointers we 
get are actually interfaces, not internal references (for security 
reasons).
The issue is that the GC does affect my code a lot in big apps and 
I'd like to know exactly when and how it works so I can better guess 
when it's about to jerk my code in an animation or while I'm scrolling 
stuff.
Ladislav
14-Jan-2011
[663x2]
"exactly when and how it works"
 - there are at least three reasons why this is what you can't get:


1) *when* the GC works is "unpredictable" from the programmer's POV 
(depending on other code, etc.)

2) It is (or may be) subject to adjustments or changes, without the 
programmer being able to detect such changes, so why should he know?

3) programming for a specific GC variant should be seen as a typical 
bad practice - why should you want to make code that is supposed 
to work only in specific circumstances? Not to mention, that you 
actually cannot do that anyway, since the GC should be programmed 
to hide any implementation details
As to why I know the GC does not count references: I know it because 
it is widely documented that reference counting cannot be used for 
garbage collection in general environments.
Andreas
14-Jan-2011
[665x2]
Refcounting w/ cycle detection or a hybrid refcounting + tracing 
approach certainly can.
But in various bits and pieces of REBOL documentation it's hinted 
that the REBOL GC is a tracing mark-sweep variant. And I seem 
to recall that Carl affirmed this at least once, but I don't have 
any link to the source handy.
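
As an illustration of the hybrid refcounting + tracing approach mentioned above, CPython (not REBOL) works this way: reference counts free most objects, and a separate cycle detector in its gc module handles what counting alone cannot. The sketch below assumes CPython; Node and probe are hypothetical names.

```python
import gc
import weakref


class Node:
    """A plain object that can point at another Node."""


gc.disable()                    # switch off the automatic cycle detector

x = Node()
y = Node()
x.other = y                     # x -> y
y.other = x                     # y -> x, so the two form a cycle

probe = weakref.ref(x)          # observe x without keeping it alive
del x, y                        # drop the only external references

# Each node is still referenced by the other, so its count never hits zero.
alive_after_del = probe() is not None

gc.collect()                    # run the tracing cycle detector explicitly
alive_after_collect = probe() is not None

gc.enable()
```

Pure reference counting leaves the cycle alive after the del; only the tracing pass reclaims it, which is the gap a cycle detector or a hybrid design closes.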
Ladislav
14-Jan-2011
[667]
I think it is safe to consider REBOL GC to be mark&sweep. But I am 
totally at a loss as to how that can help Max write any of his code.
Maxim
14-Jan-2011
[668x2]
This is just helpful for debugging. It's very useful for tracking 
down memory leaks, which are pretty easy to create in GC systems 
since we don't manage any part of it.
Also, if it has several heaps, or tricks like generations, or whatever, 
we can optimise the code so that we make it work in the "best case" 
of the GC instead of the worst case.


right now all I know is that the GC is very disruptive in very large 
apps (in R2) because you can actually see the application jam for 
a noticeable amount of time when there is animation or interactive 
things being done.
BrianH
14-Jan-2011
[670]
Alas, that kind of problem is exactly why you don't want to rely 
on the implementation model of the GC. One thing we know about the 
GC is that it is the stop the world type. If we need to solve the 
problem you mention, we would have to completely replace the internal 
memory model with a different one and use a GC with a completely 
different algorithm (generational, incremental, parallel, whatever). 
That means that all of your code optimizations would no longer work. 
If you don't try to optimize your code to the GC, it makes it possible 
to optimize the GC itself.
Maxim
14-Jan-2011
[671x3]
BrianH, if I deliver an application to a client and he says to my 
face... why does it jerk every 2 seconds?


I really don't care whether the GC might change. Right now I can't 
do anything to help it.


If it changes, I will adapt my code to it again. This is platform 
tuning and it is inherently "close to the metal", but in real 
life, such things are useful to do... 

just look at the 1 second boot linux machine in that other group.
something like this document (which has a Best Practices at the end) 
would be really nice.


http://vineetgupta.spaces.live.com/blog/cns!8DE4BDC896BEE1AD!1104.entry
obviously, since the GC is a "stop the world" system, I can't fix 
everything, but I might be able to help it along a little bit.
BrianH
14-Jan-2011
[674x2]
You can time when you call it...
I'm hoping that we might get a better collector when we work R3 over 
for concurrency.
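
A sketch of what "timing when you call it" can look like, using Python's gc module as a stand-in: automatic collection is switched off during frames and an explicit collection is run at chosen idle points. REBOL has RECYCLE for requesting a collection, but this sketch does not model it; run_frames, render and idle_every are hypothetical names.

```python
import gc


def run_frames(frame_count, render, idle_every=10):
    """Render frames with automatic GC off, collecting only at idle points."""
    gc.disable()                          # no surprise pauses mid-frame
    collections = 0
    try:
        for frame in range(frame_count):
            render(frame)
            if (frame + 1) % idle_every == 0:
                gc.collect()              # take the pause at a moment we choose
                collections += 1
    finally:
        gc.enable()                       # restore normal behaviour
    return collections


rendered = []
pauses = run_frames(30, rendered.append)
```

This does not make a stop-the-world pause any shorter; it only moves the pause to a point where a stutter is less visible.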
Maxim
14-Jan-2011
[676x2]
Yes, with proficient concurrency, it's much easier to make a non-blocking 
GC.
easier being a relative term  ;-)
BrianH
14-Jan-2011
[678]
It might be the other way around. It's harder to stop a multitasking 
world than it is to stop a single-tasking one...