[REBOL] Re: objects without overhead
From: joel:neely:fedex at: 23-Oct-2000 8:56
Hi, Ole,
Your questions deal with an area of REBOL which I agree has been
VERY much "underdocumented"; I also would love to see a clear,
definitive statement from RT about what the semantics were
intended to be, and why.
While we're waiting for such a response from RT, let me offer
a couple of thoughts on my (current!) mental model of REBOL
objects which may help explain what's happening and why. (And,
as always, I'll be grateful for any responses that expose any
misconceptions that I may have...)
Ole Friis wrote:
[...snip...]
> > Apparently modifications to the inherited function f in p do not
> > propagate to the f function in the parent object o, ergo the two
> > functions are independent of each other.
>
> So that's what the REBOL semantics apparently define. However, if
> the _implementation_ of REBOL is clever, those two functions will
> refer to the same function until you start modifying one of them.
> Then REBOL will split them into two, and modify one of them.
>
> This is, AFAIR, called "copy-on-write" and is also used in
> NewtonScript, the script language that accompanied the Newton
> message pad from Apple. This scripting language implementation
> had lots of available ROM, but not much RAM, so "copy-on-write"
> saved lots of RAM this way (and used some ROM because of the
> added complexity of the interpreter code). (BTW, NewtonScript is
> prototype-based, just like REBOL.)
>
Having used NewtonScript myself, I must interject that there are
some very significant differences between the two languages. I
would describe NewtonScript as using a "delegation model" for
objects -- Walter Smith has explicitly credited Self as being
the inspiration for this aspect of NewtonScript. In that model,
as in REBOL, there are no classes, just objects.
THE NEWTONSCRIPT WAY...
UNLIKE REBOL, each object can refer to a "parent" object.
Any attempt to use a data member or method (note the distinction!)
of an object may fail to find the desired, named attribute. If
so, the search continues with the "parent" of the original object;
this process recurs up the parent chain until either there are no
more parents (in which case there is an error) or the named element
is found. If the named attribute is a method, it is executed WITH
ALL VARIABLES SUBJECT TO THE SAME ATTRIBUTE LOOKUP AS BEFORE, BUT
BEGINNING WITH THE ORIGINAL OBJECT.
For example:
[I'm using pseudo-REBOL syntax here for the benefit of non-Newton-
Script folks on the list but this is NOT valid REBOL code -- nor
syntactically correct NewtonScript.]
    grandma: make object! [
        a: 1 b: 1 c: 1
        sum: func [] [a + b + c]
    ]
    mama: make delegating-object! [
        _parent: grandma
        b: 2
    ]
    child: make delegating-object! [
        _parent: mama
        c: 3
    ]
    grandma/sum == 3
    mama/sum == 4
  * child/sum == 6
    child/a: 10
    child/sum == 15
The evaluation of the starred expression is based on the text of
SUM, where C resolves to the C in CHILD, B resolves to the B in
MAMA, and A resolves to the A in GRANDMA. In the following line,
a new attribute is added to CHILD, which now "shadows" the A that
would have been "inherited" from the ancestry. Thus the last
expression evaluates using A in CHILD, B in MAMA, and C in CHILD.
Very dynamic, and very different from REBOL.
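For anyone who would like to run the delegation lookup described above, here is a minimal sketch in Python (neither REBOL nor NewtonScript). The class name DelegatingObject and the helpers lookup, send and total are made up for this illustration; the sketch models only the parent-chain search and the rule that an inherited method runs against the original receiver.

```python
class DelegatingObject:
    """Toy model of a delegating object: local slots plus an optional parent."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        # Walk up the parent chain until some object defines the slot.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name):
        # The method may be found in an ancestor, but it runs against the
        # ORIGINAL receiver, so its variable lookups start again from self.
        method = self.lookup(name)
        return method(self)


def total(receiver):
    # Stand-in for the SUM method: a + b + c, each resolved via delegation.
    return receiver.lookup("a") + receiver.lookup("b") + receiver.lookup("c")


grandma = DelegatingObject(a=1, b=1, c=1, sum=total)
mama = DelegatingObject(parent=grandma, b=2)
child = DelegatingObject(parent=mama, c=3)

results = [grandma.send("sum"), mama.send("sum"), child.send("sum")]
child.slots["a"] = 10          # a new slot in child shadows grandma's a
results.append(child.send("sum"))
print(results)                 # [3, 4, 6, 15]
```

The four results match the four values in the pseudo-REBOL listing, including the change after CHILD gains its own A.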
THE REBOL WAY
The behavior of REBOL "objects" is different from any standard
model with which I am familiar. The difference arises from two
key facts about REBOL:
1) Evaluation of a REBOL word requires a "context" which gives
meaning to the word.
>> foo: make object! [
[ a: 1 b: 2 f: func [] [a + b]
[ g: func [b [block!]] [append b 'a]
[ ]
>> fum: make object! [
[ a: 4 b: 3 f: func [] [a - b]
[ g: func [b [block!]] [append b 'a]
[ ]
>> a: "Hello, "
== "Hello, "
>> b: "world!"
== "world!"
>> f: func [] [print join a b]
>> g: func [b [block!]] [append b 'a]
>> foo/f
== 3
>> fum/f
== 1
>> f
Hello, world!
>> c: []
== []
>> foo/g c
== [a]
>> fum/g c
== [a a]
>> g c
== [a a a]
>> reduce c
== [1 4 "Hello, "]
>> print c
1 4 Hello,
2) REBOL is a von Neumann Language -- it refuses to distinguish
between code and data (note again that this distinction is
made in NewtonScript).
>> h: to-paren [print ["eenie" "meenie" "mynie"]]
== (print ["eenie" "meenie" "mynie"])
>> h
eenie meenie mynie
>> append second :h "no mo!"
== ["eenie" "meenie" "mynie" "no mo!"]
>> h
eenie meenie mynie no mo!
From the first point above, we see that it is impossible to know
how to evaluate a word (even given the spelling of its name!)
without knowing the context in which that word is being considered.
Further, word! values in REBOL (NOT the spellings of their names,
but the internal values actually involved) clearly refer back to
their defining contexts -- that's why the value of C only *appears*
to contain three mentions of the same word, when it actually
mentions three distinct words whose names are simply spelled alike!
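The idea that a word value carries its own context can be modeled in a few lines of Python. This is only a sketch of the mental model described here, not of how REBOL is implemented; the Word class and the three dictionaries standing in for contexts are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    spelling: str
    context: dict   # the context this particular word is bound to

    def value(self):
        return self.context[self.spelling]


foo_ctx = {"a": 1, "b": 2}
fum_ctx = {"a": 4, "b": 3}
global_ctx = {"a": "Hello, ", "b": "world!"}

# Analogue of appending 'a from FOO/G, FUM/G and the global G in turn.
c = [Word("a", foo_ctx), Word("a", fum_ctx), Word("a", global_ctx)]

spellings = [w.spelling for w in c]
reduced = [w.value() for w in c]
print(spellings)   # ['a', 'a', 'a']
print(reduced)     # [1, 4, 'Hello, ']
```

The three entries print with the same spelling, yet each one evaluates in a different context, which is the behavior REDUCE C showed in the transcript.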
From the second point above, we see that there's no way to know --
even in principle -- whether a REBOL value is code or data, because
to REBOL that is a distinction only of usage, not of nature.
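As a rough analogue in Python, a "block" can be held as an ordinary list and handed to a tiny evaluator. The function name do and the one-operation instruction format are inventions for this sketch; the point is only that the same value is evaluated at one moment and edited as plain data at the next.

```python
def do(block):
    # Tiny evaluator: a block of the form ["print", [items...]] joins the items.
    op, args = block
    if op != "print":
        raise ValueError(op)
    return " ".join(args)


h = ["print", ["eenie", "meenie", "mynie"]]
before = do(h)            # use the value as code
h[1].append("no mo!")     # use the very same value as data
after = do(h)
print(before)             # eenie meenie mynie
print(after)              # eenie meenie mynie no mo!
```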
With those two facts firmly in mind, now consider how to understand
what's happening in the following expansion of your example:
>> a: make object! [t: "aaa" speak: func [] [print t]]
>> b: make a []
>> c: make a []
>> same? b/t c/t
== false
Why are they not the same? In order for us to be able to do this:
>> b/t: "bbb"
== "bbb"
>> c/t: "ccc"
== "ccc"
>> a/speak
aaa
>> b/speak
bbb
>> c/speak
ccc
we need to be able to alter the value of a word within a specific
object and have subsequent evaluations in that object's context use
the new value -- without confusing it with another word in another
object EVEN IF THE NAMES ARE SPELLED THE SAME.
You might still be wondering, "But why not copy-on-write?" Well,
consider first the following:
>> t: "ttt"
== "ttt"
* >> a/speak: func [] [print t]
>> a/speak
ttt
>> a/t
== "aaa"
The T in the new value of A/SPEAK is no longer the same as the T
inside of A. At the point of creation of a function, each name
appearing in the body of the function must be converted to a word!
value, and (as we have already seen) each (internal) word! value
identifies its own context.
The starred line immediately above requires REBOL to create a
function body from the typed-in "source code", so REBOL "looks
up" the name {t} and finds only the global one, because I was
typing at the interpreter's command prompt. Therefore, THAT
is the T that is used in the function, and not the T from A.
[I consider my use of quotations in the preceding paragraph to
be necessary, as I am using non-standard terminology to
describe REBOL behavior that I've not been able to find described
in official REBOL documentation with official REBOL terminology.]
Therefore, when we said
>> b: make a []
earlier, REBOL had to make a context for B, then make a new T in
that context, then make a new SPEAK in that context. But in
making that new SPEAK ***the mention of {t} inside of the body
of SPEAK had to refer to the new T inside B*** and not the T of A.
If a copy-on-write strategy tried to make SPEAK of B refer to the
same function! value as that of SPEAK of A (only changing when SPEAK
of B is redefined), words inside B/SPEAK couldn't refer to the words
of B, but of A.
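The argument against sharing can be sketched with a toy model in Python. Everything here is an assumption made for illustration (the Func class, the make function and its rebind flag); it is not a description of REBOL's internals. A function body is modeled as a list of words, each paired with the context it is bound to.

```python
class Func:
    """A body is a list of (spelling, context) pairs; calling the function
    reads each word from the context that word is bound to."""

    def __init__(self, spellings, context):
        self.body = [(s, context) for s in spellings]

    def __call__(self):
        return [ctx[s] for s, ctx in self.body]

    def rebound_copy(self, context):
        # Same spellings, but every word now points at the new context.
        return Func([s for s, _ in self.body], context)


def make(proto, rebind=True):
    # Derive a new object: a fresh context with copied values.
    new = {}
    for name, value in proto.items():
        if isinstance(value, Func):
            new[name] = value.rebound_copy(new) if rebind else value
        else:
            new[name] = value
    return new


a = {"t": "aaa"}
a["speak"] = Func(["t"], a)

b = make(a)                     # the observed behavior: SPEAK is copied and rebound
b["t"] = "bbb"

shared = make(a, rebind=False)  # hypothetical: reuse A's function value as-is
shared["t"] = "sss"

print(a["speak"](), b["speak"](), shared["speak"]())
# ['aaa'] ['bbb'] ['aaa']
```

In the shared case the derived object has its own T, but its SPEAK still reads the T of A, because the words inside the shared body were never rebound.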
Incidentally, this also explains en passant the reason why function
bodies are distinct from the blocks which were used to create them
(i.e., why the function body is a distinct copy of the block):
>> body: [print "Hello, world!"]
== [print "Hello, world!"]
>> sayhi: func [] body
>> sayhi
Hello, world!
>> change next body "Hi, mom!"
== []
>> body
== [print "Hi, mom!"]
>> sayhi
Hello, world!
>> source sayhi
sayhi: func [][print "Hello, world!"]
Changing BODY no longer changes the behavior of SAYHI because words
used in SAYHI must be understood in the context in which SAYHI was
defined. The fastest way to do this is to construct a new body
for SAYHI, using the block supplied to FUNC as "inspiration" rather
than as a value to which the new function can simply refer.
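The copying half of this can be shown with a small Python analogue. The helper func below is made up for the sketch and models only the private copy made at definition time, not the binding of words.

```python
def func(body):
    own_body = list(body)          # private copy, made when the function is defined

    def call():
        return " ".join(own_body)

    return call


body = ["Hello,", "world!"]
sayhi = func(body)
body[:] = ["Hi,", "mom!"]          # change the original block in place
print(sayhi())                     # Hello, world!
print(body)                        # ['Hi,', 'mom!']
```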
As one final, more extreme set of examples, consider:
>> body: [print [a + b + c]]
== [print [a + b + c]]
>> a: b: c: 1
== 1
>> f: func [] body
>> f
3
>> obj0: make object! [
[ a: b: c: 2
[ f: func [] body
[ ]
>> obj0/f
3
The names inside BODY already had been translated to internal word!
values, which were defined in the global context, therefore those
were the words used inside OBJ0/F (a surprise!)
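The surprise can be reproduced in a toy Python model where a block's words are bound once, when the block is first created, and keep that binding when the block is later turned into a function. The tuple representation and the helper func are assumptions for this sketch only.

```python
global_ctx = {"a": 1, "b": 1, "c": 1}

# Creating the block binds each word once, here to the global context.
body = [(name, global_ctx) for name in ("a", "b", "c")]


def func(block):
    words = list(block)            # copy the block, keep the existing bindings
    return lambda: sum(ctx[name] for name, ctx in words)


f = func(body)

obj0 = {"a": 2, "b": 2, "c": 2}
obj0["f"] = func(body)             # same pre-bound words; obj0's a, b, c are ignored

print(f(), obj0["f"]())            # 3 3
```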
>> bodystring: "print [a + b + c]"
== "print [a + b + c]"
>> obj1: make object! [
[ a: b: c: 3
[ f: func [] to-block bodystring
[ ]
>> obj1/f
** Script Error: print is not defined in this context.
** Where: print [a + b + c]
>> probe obj1
make object! [
a: 3
b: 3
c: 3
f: func [][print [a + b + c]]
]
Obviously (?) the process of converting names to word! values is
not defined (or currently implemented, at least!) to expand its
lookup outside the context immediately at hand. Thus, the name
{print} was *assumed* to refer to a word! in the context of
OBJ1 -- presumably with the intent of allowing forward references,
but that's just speculation on my part -- and was not understood
as mention of the global PRINT function.
Many more such obscure cases can be constructed. Absent any RT
documentation of intent, it is impossible to determine whether
any of these are bugs.
> - Why was REBOL designed this way (as I don't see any benefits of
> doing it that way, as I don't see memory overhead as a benefit)
>
Again, absent any RT documentation, it is impossible to tell
whether this is
1) Intentional, according to some undocumented goals for how
objects should behave;
2) A logically-unavoidable consequence of some undocumented goals
for how functions, contexts, and words should behave;
3) An unplanned consequence of how functions, contexts, and words
are currently implemented (i.e., could have been differently
done, with subtly different behavior); or
4) Some combination of the above.
The most troubling implication of #3 is that a future version of
REBOL might have to make a choice between perpetuating legacy
behavior even in the face of a need to change, or breaking user-
written code because an implementation detail does change.
> - _Does_ the REBOL interpreter actually use "copy-on-write",
>
No, AFAICT.
> or should we get used to writing object-oriented REBOL programs
> in obscure ways to avoid memory and speed penalties (the latter
> because the values in the prototype object has to be copied
> somehow, and this takes time)?
>
I hope this little essay has provided at least a conceptual model
for REBOL objects. I certainly don't claim that it explains the
actual thought processes that were involved in the design and
evolution of REBOL.
> The above two questions are intentionally written in a
> provocative way to, well, provoke REBOL Tech. to answer them :-)
>
As were my comments in this little essay.
However, for about a year I've been writing my attempts at
conceptual models for REBOL behavior (as much to clarify my own
thinking, as to try to help fellow REBOL programmers).
To the best of my knowledge, none of them has provoked a reply from
RT stating "you got it right" or "you got it wrong and here's what
you should have said" or even just "you got it wrong."
Hey, RT, please understand that I'm not trying to offend anyone, nor
cast unfair criticism! What I *am* trying to do is understand REBOL
and help explain it to myself and my fellow programmers. That is
(in part) my way of trying to help REBOL (and RT) succeed. I hope
that you'll take my comments in that spirit, and confirm or correct
my attempts at explanation in like spirit.
-jn-