Mailing List Archive: 49091 messages

[REBOL] Re: objects without overhead

From: joel:neely:fedex at: 23-Oct-2000 8:56

Hi, Ole,

Your questions deal with an area of REBOL which I agree has been VERY much "underdocumented"; I also would love to see a clear, definitive statement from RT about what the semantics were intended to be, and why.

While we're waiting for such a response from RT, let me offer a couple of thoughts on my (current!) mental model of REBOL objects which may help explain what's happening and why. (And, as always, I'll be grateful for any responses that expose any misconceptions that I may have...)

Ole Friis wrote:
[...snip...]
> > Apparently modifications to the inherited function f in p do not
> > propagate to the f function in the parent object o, ergo the two
> > functions are independent of each other.
>
> So that's what the REBOL semantics apparently define. However, if
> the _implementation_ of REBOL is clever, those two functions will
> refer to the same function until you start modifying one of them.
> Then REBOL will split them into two, and modify one of them.
>
> This is, AFAIR, called "copy-on-write" and is also used in
> NewtonScript, the script language that accompanied the Newton
> message pad from Apple. This scripting language implementation
> had lots of available ROM, but not much RAM, so "copy-on-write"
> saved lots of RAM this way (and used some ROM because of the
> added complexity of the interpreter code). (BTW, NewtonScript is
> prototype-based, just like REBOL.)
>
Having used NewtonScript myself, I must interject that there are some very significant differences between the two languages. I would describe NewtonScript as using a "delegation model" for objects -- Walter Smith has explicitly credited Self as being the inspiration for this aspect of NewtonScript. In that model there are no classes (as with REBOL), just objects.

THE NEWTONSCRIPT WAY...

UNLIKE REBOL, each object can refer to a "parent" object. Any attempt to use a data member or method (note the distinction!) of an object may fail to find the desired, named attribute. If so, the search continues with the "parent" of the original object; this process recurs up the parent chain until either there are no more parents (in which case there is an error) or the named element is found.

If the named attribute is a method, it is executed WITH ALL VARIABLES SUBJECT TO THE SAME ATTRIBUTE LOOKUP AS BEFORE, BUT BEGINNING WITH THE ORIGINAL OBJECT. For example:

[I'm using pseudo-REBOL syntax here for the benefit of non-NewtonScript folks on the list but this is NOT valid REBOL code -- nor syntactically correct NewtonScript.]

    grandma: make object! [
        a: 1
        b: 1
        c: 1
        sum: func [] [a + b + c]
    ]
    mama: make delegating-object! [
        _parent: grandma
        b: 2
    ]
    child: make delegating-object! [
        _parent: mama
        c: 3
    ]

      grandma/sum   == 3
      mama/sum      == 4
    * child/sum     == 6
      child/a: 10
      child/sum     == 15

The evaluation of the starred expression is based on the text of SUM, where C resolves to the C in CHILD, B resolves to the B in MAMA, and A resolves to the A in GRANDMA. In the following line, a new attribute is added to CHILD, which now "shadows" the A that would have been "inherited" from the ancestry. Thus the last expression evaluates using A in CHILD, B in MAMA, and C in CHILD.

Very dynamic, and very different from REBOL.

THE REBOL WAY

The behavior of REBOL "objects" is different from any standard model with which I am familiar.
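As a point of comparison for what follows, the delegation lookup from the NewtonScript section above can be imitated in a few lines of Python. This is only a rough model: the class `Delegating`, its `lookup` and `send` methods, and the helper `total` are names made up for illustration, and the code is neither NewtonScript nor REBOL.

```python
class Delegating:
    """Toy object with parent delegation (illustrative; not NewtonScript or REBOL)."""

    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)

    def lookup(self, name):
        # Search this object first, then walk up the parent chain.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)

    def send(self, name):
        # The method may be found in an ancestor, but it runs against the
        # ORIGINAL receiver, so its variable lookups start there.
        return self.lookup(name)(self)


def total(receiver):
    # Plays the role of SUM: every variable goes through the lookup chain.
    return receiver.lookup("a") + receiver.lookup("b") + receiver.lookup("c")


grandma = Delegating(a=1, b=1, c=1, sum=total)
mama = Delegating(parent=grandma, b=2)
child = Delegating(parent=mama, c=3)

results = [grandma.send("sum"), mama.send("sum"), child.send("sum")]
child.slots["a"] = 10          # a new slot in child shadows grandma's a
results.append(child.send("sum"))
print(results)                 # [3, 4, 6, 15]
```

The single `total` function is shared by all three objects; only the starting point of the lookup changes from call to call.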
The difference arises from two key facts about REBOL:

1) Evaluation of a REBOL word requires a "context" which gives meaning to the word.
>> foo: make object! [
[    a: 1  b: 2  f: func [] [a + b]
[    g: func [b [block!]] [append b 'a]
[    ]
>> fum: make object! [
[    a: 4  b: 3  f: func [] [a - b]
[    g: func [b [block!]] [append b 'a]
[    ]
>> a: "Hello, "
== "Hello, "
>> b: "world!"
== "world!"
>> f: func [] [print join a b]
>> g: func [b [block!]] [append b 'a]
>> foo/f
== 3
>> fum/f
== 1
>> f
Hello, world!
>> c: []
== []
>> foo/g c
== [a]
>> fum/g c
== [a a]
>> g c
== [a a a]
>> reduce c
== [1 4 "Hello, "]
>> print c
1 4 Hello, 

2) REBOL is a von Neumann Language -- it refuses to distinguish between code and data (note again that this distinction is made in NewtonScript).
>> h: to-paren [print ["eenie" "meenie" "mynie"]]
== (print ["eenie" "meenie" "mynie"])
>> h
eenie meenie mynie
>> append second :h "no mo!"
== ["eenie" "meenie" "mynie" "no mo!"]
>> h
eenie meenie mynie no mo!
From the first point above, we see that it is impossible to know how to evaluate a word (even given the spelling of its name!) without knowing the context in which that word is being considered. Further, word! values in REBOL (NOT the spellings of their names, but the internal values actually involved) clearly refer back to their defining contexts -- that's why the value of C only *appears* to contain three mentions of the same word, when it actually mentions three distinct words whose names are simply spelled alike!
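This first point can be imitated in Python with a tiny model in which a word is a spelling plus a reference to the context it belongs to. The `Word` class and the three dictionaries below are assumptions made for illustration; they mirror my mental model of the transcript above, not REBOL's actual internals.

```python
class Word:
    """A word value: a spelling plus the context (here a dict) it is bound to."""

    def __init__(self, spelling, context):
        self.spelling = spelling
        self.context = context

    def get(self):
        # Evaluating a word means looking up its spelling in ITS OWN context.
        return self.context[self.spelling]

    def __repr__(self):
        # When displayed, only the spelling is visible.
        return self.spelling


global_ctx = {"a": "Hello, ", "b": "world!"}
foo_ctx = {"a": 1, "b": 2}
fum_ctx = {"a": 4, "b": 3}

# The block C after foo/g, fum/g and the global g have each appended their own A.
c = [Word("a", foo_ctx), Word("a", fum_ctx), Word("a", global_ctx)]

printed = repr(c)
reduced = [word.get() for word in c]
print(printed)   # [a, a, a]
print(reduced)   # [1, 4, 'Hello, ']
```

The three entries look identical when displayed, yet each one evaluates in a different context.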
From the second point above, we see that there's no way to know -- even in principle -- whether a REBOL value is code or data, because to REBOL that is a distinction only of usage, not of nature.

With those two facts firmly in mind, now consider how to understand what's happening in the following expansion of your example:
>> a: make object! [t: "aaa" speak: func [] [print t]]
>> b: make a []
>> c: make a []
>> same? b/t c/t
== false

Why are they not the same? In order for us to be able to do this:
>> b/t: "bbb"
== "bbb"
>> c/t: "ccc"
== "ccc"
>> a/speak
aaa
>> b/speak
bbb
>> c/speak
ccc

We need to be able to alter the value of a word within a specific object and have subsequent evaluations in that object's context use the new value -- without confusing it with another word in another object EVEN IF THE NAMES ARE SPELLED THE SAME.

You might still be wondering, "But why not copy-on-write?" Well, consider first the following:
>> t: "ttt"
== "ttt"
* >> a/speak: func [] [print t]
>> a/speak
ttt
>> a/t
== "aaa"

The T in the new value of A/SPEAK is no longer the same as the T inside of A. At the point of creation of a function, each name appearing in the body of the function must be converted to a word! value, and (as we have already seen) each (internal) word! value identifies its own context. The starred line immediately above requires REBOL to create a function body from the typed-in "source code", so REBOL "looks up" the name {t} and finds only the global one, because I was typing at the interpreter's command prompt. Therefore, THAT is the T that is used in the function, and not the T from A.

[I consider my use of quotations in the preceding paragraph to be necessary, as I am using non-standard terminology to describe REBOL behavior that I've not been able to find described in official REBOL documentation with official REBOL terminology.]

Therefore, when we said
>> b: make a []
earlier, REBOL had to make a context for B, then make a new T in that context, then make a new SPEAK in that context. But in making that new SPEAK the mention of {t} inside of the body of SPEAK had to refer to the new T inside B and not the T of A. If a copy-on-write strategy tried to make SPEAK of B refer to the same function! value as that of SPEAK of A (only changing when SPEAK of B is redefined), words inside B/SPEAK couldn't refer to the words of B, only to those of A.

Incidentally, this also explains en passant the reason why function bodies are distinct from the blocks which were used to create them (i.e., why the function body is a distinct copy of the block):
>> body: [print "Hello, world!"]
== [print "Hello, world!"]
>> sayhi: func [] body
>> sayhi
Hello, world!
>> change next body "Hi, mom!"
== []
>> body
== [print "Hi, mom!"]
>> sayhi
Hello, world!
>> source sayhi
sayhi: func [][print "Hello, world!"]

Changing BODY no longer changes the behavior of SAYHI because words used in SAYHI must be understood in the context in which SAYHI was defined. The fastest way to do this is to construct a new body for SAYHI, using the block supplied to FUNC as "inspiration" rather than as a value to which the new function can simply refer.

As one final, more extreme set of examples, consider:
>> body: [print [a + b + c]]
== [print [a + b + c]]
>> a: b: c: 1
== 1
>> f: func [] body
>> f
3
>> obj0: make object! [
[    a: b: c: 2
[    f: func [] body
[    ]
>> obj0/f
3

The names inside BODY already had been translated to internal word! values, which were defined in the global context, therefore those were the words used inside OBJ0/F (a surprise!)
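One way to picture this surprise is a Python model in which binding walks only the block it is handed and never follows a word through to the value it names. Everything here (`Word`, `bind`, the dictionaries, the variable names) is a hypothetical stand-in for my reading of the behavior shown above, not a description of how REBOL is implemented.

```python
class Word:
    """A spelling plus the context (a dict) the word is bound to."""

    def __init__(self, spelling, context):
        self.spelling = spelling
        self.context = context

    def get(self):
        return self.context[self.spelling]


def bind(block, context):
    """Rebind, in place, each Word in block (and nested lists) known to context."""
    for item in block:
        if isinstance(item, list):
            bind(item, context)
        elif isinstance(item, Word) and item.spelling in context:
            item.context = context


global_ctx = {"a": 1, "b": 1, "c": 1}
body = [Word("a", global_ctx), Word("b", global_ctx), Word("c", global_ctx)]
global_ctx["body"] = body

obj0 = {"a": 2, "b": 2, "c": 2}

# Case 1: the object spec mentions BODY only by name. Binding the spec does
# not visit the block that BODY refers to, so its words stay global.
spec = [Word("body", global_ctx)]
bind(spec, obj0)
f_body = list(spec[0].get())                  # the function takes a copy
via_name = sum(word.get() for word in f_body)

# Case 2: the same three words written literally inside the spec get rebound.
literal_spec = [[Word("a", global_ctx), Word("b", global_ctx), Word("c", global_ctx)]]
bind(literal_spec, obj0)
via_literal = sum(word.get() for word in literal_spec[0])

print(via_name, via_literal)   # 3 6
```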
>> bodystring: "print [a + b + c]"
== "print [a + b + c]"
>> obj1: make object! [
[    a: b: c: 3
[    f: func [] to-block bodystring
[    ]
>> obj1/f
** Script Error: print is not defined in this context.
** Where: print [a + b + c]
>> probe obj1
make object! [
    a: 3
    b: 3
    c: 3
    f: func [][print [a + b + c]]
]

Obviously (?) the process of converting names to word! values is not defined (or currently implemented, at least!) to expand its lookup outside the context immediately at hand. Thus, the name {print} was *assumed* to refer to a word! in the context of OBJ1 -- presumably with the intent of allowing forward references, but that's just speculation on my part -- and was not understood as mention of the global PRINT function.

Many more such obscure cases can be constructed. Absent any RT documentation of intent, it is impossible to determine whether any of these are bugs.
> - Why was REBOL designed this way (as I don't see any benefits of
>   doing it that way, as I don't see memory overhead as a benefit)
>
Again, absent any RT documentation, it is impossible to tell whether this is

1) Intentional, according to some undocumented goals for how objects should behave;

2) A logically-unavoidable consequence of some undocumented goals for how functions, contexts, and words should behave;

3) An unplanned consequence of how functions, contexts, and words are currently implemented (i.e., could have been differently done, with subtly different behavior); or

4) Some combination of the above.

The most troubling implication of #3 is that a future version of REBOL might have to make a choice between perpetuating legacy behavior even in the face of a need to change, or breaking user-written code because an implementation detail does change.
> - _Does_ the REBOL interpreter actually use "copy-on-write",
>
No, AFAICT.
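The reasoning behind that answer can be put into a small Python model. It assumes my mental model from earlier in this message (a word carries a reference to its context, and MAKE builds a new context with rebound copies of the function bodies); the names `Word`, `Func`, `make_from` and `shared` are invented for the sketch and say nothing about how REBOL is really implemented.

```python
class Word:
    """A spelling plus the context (a dict) the word is bound to."""

    def __init__(self, spelling, context):
        self.spelling = spelling
        self.context = context

    def get(self):
        return self.context[self.spelling]


class Func:
    """A function value: here just a body, i.e. a list of bound words."""

    def __init__(self, body):
        self.body = body

    def call(self):
        return [word.get() for word in self.body]


def make_from(proto):
    """Model of `make proto []`: new context, function bodies copied and rebound."""
    new = {}
    for name, value in proto.items():
        if isinstance(value, Func):
            new[name] = Func([
                Word(w.spelling, new) if w.context is proto else w
                for w in value.body
            ])
        else:
            new[name] = value
    return new


a = {"t": "aaa"}
a["speak"] = Func([Word("t", a)])

b = make_from(a)      # gets its own SPEAK, whose T belongs to b
b["t"] = "bbb"

shared = dict(a)      # hypothetical copy-on-write-style clone: reuses a's SPEAK
shared["t"] = "ccc"

print(a["speak"].call(), b["speak"].call(), shared["speak"].call())
# ['aaa'] ['bbb'] ['aaa']
```

In this model the shared function still reads T from A's context, which is exactly what a derived object must not do; sharing could only work if words were resolved by name at call time rather than carrying their context with them.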
> or should we get used to writing object-oriented REBOL programs
> in obscure ways to avoid memory and speed penalties (the latter
> because the values in the prototype object has to be copied
> somehow, and this takes time)?
>
I hope this little essay has provided at least a conceptual model for REBOL objects. I certainly don't claim that it explains the actual thought processes that were involved in the design and evolution of REBOL.
> The above two questions are intentionally written in a
> provocative way to, well, provoke REBOL Tech. to answer them :-)
>
As were my comments in this little essay. However, for about a year I've been writing my attempts at conceptual models for REBOL behavior (as much to clarify my own thinking, as to try to help fellow REBOL programmers). To the best of my knowledge, none of them have provoked a reply from RT stating either "you got it right" or "you got it wrong and here's what you should have said" or even "you got it wrong."

Hey, RT, please understand that I'm not trying to offend anyone, nor cast unfair criticism! What I *am* trying to do is understand REBOL and help explain it to myself and my fellow programmers. That is (in part) my way of trying to help REBOL (and RT) succeed. I hope that you'll take my comments in that spirit, and confirm or correct my attempts at explanation in like spirit.

-jn-