World: r3wp

[!REBOL3]

BrianH
4-Mar-2010
[1227x5]
Andreas, I don't have a problem with that solution in principle. 
It's just that it wouldn't work, and wouldn't be task-safe. The handlers 
for those functions would be task-local, the code blocks not. Plus 
it would break code that uses code block references rather than nested 
blocks, code that uses those functions through function values, and 
any function with the [throw] attribute (which we will be getting 
back in R3 with different syntax), and all of those exist in R3 mezzanine 
code. Plus there's all the extra BIND/copy overhead added to every 
call to loop functions, startup code, etc., and don't think that 
you won't notice that because that can double the memory usage and 
execution time, at least.


The solution I proposed in the ticket comments is to have DO, CATCH 
and the loops set a task-local flag in the interpreter state when 
the relevant functions become valid, and unset it when they become 
invalid, then have the functions check the flag at runtime before 
they do their work (which they could because they're all native). 
This would be task-safe, only add a byte of task-local memory overhead, 
plus the execution overhead of setting and getting bits in that byte 
in a task-local way. It's the execution overhead that we don't know 
about, whether it would be too much. It would certainly be less than 
your proposal though.
Carl is the authority on subtle implementation overhead, but for 
gross implementation overhead anyone can tell by just using the profiling 
tools and extrapolating. And what you are proposing is definitely 
in the gross overhead category.
However, CATCH/name and THROW/name would need the additional memory 
overhead of a single block of words per task in the dynamic solution 
to store the currently handled names.
It might be hard to believe, but R3 has gotten so efficient that 
BIND/copy overhead is really noticeable now in comparison. In R2 
there were mezzanine loop functions like FORALL and FORSKIP that 
people often avoided using in favor of natives, even reorganizing 
their algorithms to allow using different loop functions like FOREACH 
or WHILE. Now that all loop functions in R3 are native speed, the 
FORALL and FORSKIP functions are preferred over FOREACH or FOR sometimes 
because FOREACH and FOR have BIND/copy overhead, and FORALL and FORSKIP 
don't. The functions without the BIND/copy overhead are much faster, 
particularly for small datasets and large amounts of code.
It's funny: While regular R3 code looks a lot like regular R2 code, 
optimized code looks a lot different because the balance of what 
is fast and what isn't has shifted. At least regular R3 code looks 
a lot more like optimized R3 code than regular R2 code looks like 
optimized R2 code. This is because we have been focusing on making 
the common, naive code patterns more optimized in R3, so that people 
don't have to do as much hand-optimization. The goal is to make it 
so that only writers of mezzanine and library code need to hand-optimize, 
and regular app developers can just use the optimized code without 
worrying about such things.
Andreas
4-Mar-2010
[1232x3]
Brian, please notice that I am talking about two things in the past 
few messages. I separated those discussions with "---".
The first is the proposal for a change of semantics, which I'm mainly 
interested in as a though experiment.
thought*
BrianH
4-Mar-2010
[1235]
Ah, cool. Glad to continue the thought experiment then :)
Andreas
4-Mar-2010
[1236x4]
Great :)
But actually I wanted to leave that experiment for now.
After the "---" I discussed the overhead of the solution you proposed 
on the bug tracker.
And if you re-read that, you will notice that it's precisely what 
you later describe.
BrianH
4-Mar-2010
[1240x2]
Yup.
Except the THROW/name block-of-words thing.
Andreas
4-Mar-2010
[1242x3]
Precisely.
The overhead of which would be more noticeable, but not too severe. 
Some simple heuristics should do fine.
As words are interned anyway, you only need an array of integers 
to store the names.
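
(Editor's sketch.) The "array of integers" idea can be illustrated like this, again as a Python model rather than R3 code. `SymbolTable` and `NamedCatchStack` are hypothetical names; the only assumptions taken from the discussion are that words are interned to integer ids and that each task keeps a list of the names currently handled by CATCH/name.

```python
class SymbolTable:
    """Hypothetical interning table: each distinct word gets one integer id."""

    def __init__(self):
        self._ids = {}

    def intern(self, word: str) -> int:
        # REBOL words compare case-insensitively, so normalize first.
        key = word.lower()
        return self._ids.setdefault(key, len(self._ids))


class NamedCatchStack:
    """Per-task list of the symbol ids currently handled by CATCH/name."""

    def __init__(self, symbols: SymbolTable):
        self.symbols = symbols
        self.active = []  # plain integers, one per handled name

    def push(self, names) -> int:
        """Register the names of a new CATCH/name; return how many were added."""
        ids = [self.symbols.intern(n) for n in names]
        self.active.extend(ids)
        return len(ids)

    def pop(self, count: int) -> None:
        """Remove the names registered by the innermost CATCH/name."""
        del self.active[len(self.active) - count:]

    def handles(self, name: str) -> bool:
        """Would a THROW/name with this name be caught right now?"""
        return self.symbols.intern(name) in self.active
```

With only a few distinct named catches active at once, the linear scan in `handles` stays short, which is the kind of simple heuristic mentioned above.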
BrianH
4-Mar-2010
[1245]
Btw, non-local code blocks are a common optimization trick in mezzanine 
code, one which shows up a lot in Carl's code. It's probably the 
reason why REBOL supports the concept in the first place. And I've 
written code in REPLACE that uses the BREAK function as a function 
value, though I haven't checked whether other people use this trick 
:)
Andreas
4-Mar-2010
[1246x2]
Let's finish the performance discussion first :)
Typical code will have only very few distinct named catches.
BrianH
4-Mar-2010
[1248]
That's for sure - I haven't seen it yet in mezzanine code.
Andreas
4-Mar-2010
[1249]
So I think the vast majority of cases can be handled by very efficient 
code.
BrianH
4-Mar-2010
[1250]
It seems so. I've asked Carl to look at those tickets and chime in, 
so we'll see what he thinks.
Andreas
4-Mar-2010
[1251x2]
Great, I'd really like to see this improved, even if it's only a 
rare corner case.
That said, we can go back to the thought experiment, if you like 
:)
BrianH
4-Mar-2010
[1253]
If you want to see how weird really optimized R3 code can get, take 
a look at the source of LOAD and IMPORT - they are probably the most 
heavily optimized mezzanine functions. For the most part the rest 
of the mezzanine code is written for readability and maintainability, 
and the language optimized to make readable code fast. It's a good 
tradeoff :)
Andreas
4-Mar-2010
[1254]
I'm well aware of the value of foreign code blocks as such. The interesting 
question, I guess, is how often foreign code is used without re-binding 
it.
BrianH
4-Mar-2010
[1255]
Most of the time, actually, otherwise the BIND/copy overhead would 
make it a poor optimization.
Andreas
4-Mar-2010
[1256x4]
Optimization is only one use case, though.
Do you have a succinct example of such a use for optimization purposes?
The nice BREAK trick used in your REPLACE would mostly be unaffected 
by this change, for example.
You'd just use the function value of the (not globally bound) function 
implementing break.
BrianH
4-Mar-2010
[1260x4]
IMPORT uses code blocks as a way of reusing duplicate code, though 
it might not be affected either. And REPLACE would be affected because 
'break wouldn't be bound at the point it is used: Being in a function 
isn't enough, it's outside of the loop. BREAK is used to break out 
of loops, not functions.
That means that the BIND/copy overhead for BREAK and CONTINUE would 
happen at every call to a loop function, not just FOR, FOREACH and 
REPEAT. And 'break and 'continue would become keywords rather than 
function names, unable to be used for loop-local variables.
LOOP, WHILE, FORALL and FORSKIP don't currently have BIND/copy overhead. 
Which is why they are used a lot in R3 :)
Sorry, I don't mean to go on about that.
Andreas
4-Mar-2010
[1264x2]
Huh?
I certainly enjoyed the discussion, then :)
BrianH
4-Mar-2010
[1266]
I don't have the time now to provide examples, I'm afraid, must run 
an errand. Try it yourself and see what you find out :)
Andreas
4-Mar-2010
[1267]
Regarding the BREAK usage in REPLACE. You currently have:
    do-break: unless all [:break]
I think that would just become:
    do-break: unless all [:system/contexts/system/break]
(Or wherever the BREAK function would be stored.)
BrianH
4-Mar-2010
[1268]
Right, though somewhere else.
Andreas
4-Mar-2010
[1269]
The added BIND/copy overhead for loop functions currently not needing 
to BIND their body is certainly true. FOREVER would be another one 
of those.
BrianH
4-Mar-2010
[1270]
REMOVE-EACH and MAP-EACH already have the BIND/copy overhead though.
Andreas
4-Mar-2010
[1271x2]
As do most other loop functions, I guess (FOREACH, FOR, REPEAT, etc.).
Only verified it for FOREACH yesterday and assumed that the other 
loop functions that need binding would also copy.
BrianH
4-Mar-2010
[1273x3]
Yup. But not all loop functions need binding, only the ones with 
a lit-word argument with a doc string that says "will be local" or 
some such.
Ah, REPEAT doesn't say that. I should submit a documentation bug 
report.
But all the loop functions that take a word argument have BIND/copy 
overhead, and the rest don't.
Andreas
4-Mar-2010
[1276]
Binding: foreach, repeat, remove-each, map-each
Not binding: forever, loop, while, until, forall, forskip
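
(Editor's sketch.) A rough model of why the "binding" group costs more per call. This is Python, not R3's implementation; `Word` and `bind_copy` are hypothetical names. The assumption, taken from the discussion above, is that a loop with a local word deep-copies its body and rebinds that word to a fresh context on every call, while the other loops evaluate the body block as given.

```python
class Word:
    """Hypothetical stand-in for a REBOL word: a name plus its binding."""

    def __init__(self, name, context=None):
        self.name = name
        self.context = context  # dict the word is bound to, or None


def bind_copy(block, word_name, context):
    """Deep-copy `block`, binding every `word_name` to `context`.

    Returns (copy, cells_copied). The original block is left untouched,
    so the cost grows with the size of the body on every call.
    """
    copied = 0
    out = []
    for item in block:
        if isinstance(item, list):
            # Nested block: copy it recursively and count its own cell too.
            sub, n = bind_copy(item, word_name, context)
            out.append(sub)
            copied += n + 1
        elif isinstance(item, Word):
            # Rebind only the loop word; other words keep their binding.
            ctx = context if item.name == word_name else item.context
            out.append(Word(item.name, ctx))
            copied += 1
        else:
            out.append(item)
            copied += 1
    return out, copied
```

In this model a FOREACH-style loop would call `bind_copy` once per invocation before iterating, whereas a LOOP- or WHILE-style loop would skip that step entirely. That is the difference that matters most for small datasets with large bodies.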