[REBOL] Re: to-char

From: joel:neely:fedex at: 14-Feb-2002 9:08

Hi, Jason, Carl, and all...

Sorry that the following is so long. I don't have time to write it in fewer words. English, combined with a bag of prior experience with other programming languages, is really a *TERRIBLE* medium for explaining REBOL!!! ;-)

As always, I'll be happy for any corrections or useful revisions of this discussion.

Carl Read wrote:
> > So now let me ask an even more basic [FA] question:
>
> A question for you - what's "FA"? (:
>
FA as in FAQ
> > What is happening, what is the real meaning of
>
> > someblock: []
>
> An empty block is created, as is the word 'someblock, which
> points to (references) the block.
>
I must respectfully disagree. The issue of when values are created is an entirely different discussion than the issue of what happens when those values are evaluated. There is not much documentation that separates out these issues, so this puzzle comes up in almost every REBOL programmer's path to enREBOLment.

As I understand it, when REBOL *loads* a string of the form

foo: <value>

(where <value> is a single value such as a number, string, or block) it creates an internal REBOL structure (which can serve as both code and data, depending on how it is subsequently used). In that structure there is a distinct REBOL value for each syntactical element in the source string.

However, to know some of the details, we have to know where that string came from. Console input to the interpreter is submitted to a "load-and-do" cycle which takes the input string, loads (translates) it into a REBOL structure, then DOes that structure. Starting with a fresh REBOL process, we can model that process as follows:

REBOL/View 1.2.1.3.1 21-Jun-2001
... more verbiage suppressed ...
*** Obtain REBOL/View/Pro from http://www.rebol.com/view-sales.html
>> print foo
** Script Error: foo has no value
** Near: print foo

(Just to show that REBOL has no preconceived notion of FOO...)
>> console-input-1: "foo: []"
== "foo: []"
>> length? console-input-1
== 7

The console input is a string of 7 characters.
>> console-struct-1: load console-input-1
== [foo: [] ]
>> length? console-struct-1
== 2

LOADing the console input produces a block containing two values.
>> type? console-struct-1/1
== set-word!
>> type? console-struct-1/2
== block!

LOAD created a SET-WORD! value from the string "foo:" and created a block from the string "[]" and put those two values into a new block. The (empty at this point) block doesn't have any more baggage, but there's a hidden issue with the SET-WORD! value.

Each REBOL word belongs to a context (known in other languages as an environment or other terms even less useful to us right now! ;-) I think of an environment as a dictionary that pairs the name of a word with a value (reference for some types), but that pairing is relevant only within that context. (If that last phrase is unclear, hang on; we'll try to shed more light on it Real Soon Now.)

I think of the internal representation of a word as "containing" a reference to the string that is its name and a reference to its context.

NB: THAT IS A DESCRIPTIVE MENTAL MODEL. THERE ARE MANY WAYS THAT THIS COULD ACTUALLY BE IMPLEMENTED; I DON'T KNOW WHICH OF THEM IS/ARE ACTUALLY USED IN THE VARIOUS FLAVORS OF REBOL.

In order to create the SET-WORD! value for "foo:", it must have a context. For the above case, since there's nothing to specify otherwise, that will be the global context. So a word value is created with a name of "foo" and a reference to the global context; a new definition is added to the global context containing a word name of "foo" but with no associated value at this point. (AGAIN, THINK OF THIS AS METAPHORICAL.)

Finally (whew! this guy is long-winded! ;-) we're ready to talk about the DO step.
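As an aside, the "pairing is relevant only within that context" idea can be seen with two words that share a spelling but belong to different contexts. This is only a minimal sketch, assuming REBOL 2 (the generation shown in the banner above). The names SAMPLE, DEMO-CTX, GLOBAL-WORD, and LOCAL-WORD are hypothetical and are not part of the modeling session.

```rebol
REBOL []

; Hypothetical names throughout: sample, demo-ctx, global-word, local-word.
sample: "global value"                          ; SAMPLE in the global context
demo-ctx: make object! [sample: "inner value"]  ; a second context with its own SAMPLE

global-word: 'sample              ; a word value whose context is the global one
local-word: in demo-ctx 'sample   ; same spelling, but its context is DEMO-CTX

print get global-word   ; prints: global value
print get local-word    ; prints: inner value
```

Both words are named "sample", yet GET finds a different value for each, because each word carries a reference to its own context. With that picture in mind, back to the model and the DO step.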
>> do console-struct-1
== []
>> print foo

>> print mold foo
[]

DOing a block requires evaluating each value within the block. When evaluating a SET-WORD! the interpreter does something like the following (AGAIN, METAPHORICAL):

1) put the word in question "on hold";
2) evaluate the following value/expression (let's not go into that too much right now) which in this case is a reference to a block;
3) evaluating a block (NOT the same as DOing the block!) simply yields a reference to that block;
4) take the word left on hold in (1) and find its context;
5) within that specific context/dictionary, alter the value slot associated with that word, so that now the value slot contains the value (or reference to ... yadda yadda) produced in (3);
6) in addition, the value from (3) now serves as the value for this entire process (in case this evaluation occurred within a larger evaluation -- the issue we skipped over in (2)).

At this point, the interpreter would be able to throw away a string typed into the console, and the block created from that string, since there are no surviving references to them. In the case of our little modeling exercise, that doesn't happen because we actually have words that are set to the string and block we're playing with. We'll keep them around for a little longer to make a point.

Let's model the load-and-do cycle on another string (this time without so much verbiage):
>> console-input-2: "oof: foo"
== "oof: foo"
>> console-struct-2: load console-input-2
== [oof: foo ]
>> do console-struct-2
== []

There's a new word in the global context now. Its value (in that context) is set to refer to the *same* block that (global) FOO is set to. Let's keep modeling the load-and-do cycle...
>> console-input-3: "append oof 1"
== "append oof 1"
>> console-struct-3: load console-input-3
== [append oof 1 ]
>> do console-struct-3
== [1]

I'm sure we can all describe that one, and would anticipate the result of cheating and looking at the actual words we're playing with in our model:
>> mold foo
== "[1]"
>> mold oof
== "[1]"

However, ("Finally!" you're probably thinking ;-) now I can get to my first punch line. Let's go back and look at our input strings, and then look at the structures that represent those strings in REBOL internal form (with MOLDing and some added whitespace for clarity):
>> foreach thing reduce [
[    console-input-1 console-input-2 console-input-3
[    ][print mold thing]
"foo: []"
"oof: foo"
"append oof 1"
>> foreach thing reduce [
[    console-struct-1 console-struct-2 console-struct-3
[    ][print mold thing]
[foo: [1] ]
[oof: foo ]
[append oof 1 ]

What's with the value of CONSOLE-STRUCT-1??? Remember that it originally contained two values -- a set-word! and an empty block (created empty at the time that CONSOLE-INPUT-1 was LOADed). And that's the key.

DOing CONSOLE-STRUCT-1 didn't *create* the set-word nor the empty block. They were created when CONSOLE-INPUT-1 was LOADed. All that happened when CONSOLE-STRUCT-1 was DOne was that the value (in the global context) for FOO was set to (a reference to) the block which *already* existed and was referred to in the second position of CONSOLE-STRUCT-1.

DOing CONSOLE-STRUCT-2 set (global) OOF (created when we LOADed CONSOLE-INPUT-2) to refer to that same block (still empty at that time). At that point there were three references to that block: the original reference in CONSOLE-STRUCT-1, in the global context for FOO, and in the global context for OOF. The first of those only remained in existence because of our modeling; if we simply typed the console input strings in at the prompt, the older strings and blocks would have already gone back to the recycling plant as soon as the next input was typed.

Just to prove that these are three references to the same block, let's cheat on our model. We'll set OOF directly and see the consequences.
>> oof: "no block here!"
== "no block here!"
>> foreach thing reduce [
[    console-struct-1 console-struct-2 console-struct-3
[    ][print mold thing]
[foo: [1] ]
[oof: foo ]
[append oof 1 ]

Setting OOF to a new string (created when this new console input was loaded -- outside our model), we simply change the global dictionary definition for OOF to something else. We haven't altered the value with which OOF was associated before that point.
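The difference between re-setting a word and mutating the series it refers to can also be checked directly with SAME?. What follows is a minimal sketch, assuming REBOL 2. The words SHARED, ALIAS-1, and ALIAS-2 are hypothetical stand-ins for the block held in CONSOLE-STRUCT-1, FOO, and OOF, chosen so that running it does not disturb the session above.

```rebol
REBOL []

; Hypothetical names: shared, alias-1, alias-2.
shared: copy []     ; one block ...
alias-1: shared     ; ... and two more words set to that very same block
alias-2: shared

print same? shared alias-1   ; prints: true

alias-2: "no block here!"    ; re-set ALIAS-2 only; the block is untouched
print mold shared            ; prints: []

append alias-1 1             ; mutate the block through ALIAS-1
print mold shared            ; prints: [1]
print mold alias-2           ; prints: "no block here!"
```

Re-setting ALIAS-2 changes one dictionary entry and nothing else, while APPEND changes the block itself, so the change shows up through every word still set to that block.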
>> oof: append foo "I'm back!"
== [1 "I'm back!"]

Now we've reSET the value of OOF to the value of an expression that *also* mutates the value to which FOO is set. Therefore, we now see the effect of that mutation through all references to that same value:
>> foreach thing reduce [
[    console-struct-1 console-struct-2 console-struct-3
[    ][print mold thing]
[foo: [1 "I'm back!"] ]
[oof: foo ]
[append oof 1 ]

With all of that in place, let's fast-forward to Carl's comments on functions:
> More on functions: A series is created when the function is
> first created and not each time the function is called, which is
> why what's in a series will persist from function-call to
> function-call unless you specifically clear it or make a copy of
> it. Whether to use 'copy or 'clear (or neither for that matter)
> will depend on the behaviour you want from the series.
>
I agree 100% with what Carl meant, but -- with apologies -- let me try to reword a little bit by continuing our modeling exercise.
>> console-input-4: "trick: func [/local foo] [foo: append [] 1]"
== "trick: func [/local foo] [foo: append [] 1]"
>> console-struct-4: load console-input-4
== [trick: func [/local foo] [foo: append [] 1] ]
>> do console-struct-4
Now there's a global word TRICK which is set to a FUNCTION! value. The SECOND part of a FUNCTION! value is a block -- the "body" of the function.
>> second :trick
== [foo: append [] 1]

When was that FUNCTION! value created? When CONSOLE-STRUCT-4 was DOne. When FUNC is applied to two blocks, it constructs a new FUNCTION! value with a process something like this:

1) create a new (empty) context;
2) add to that context every argument and refinement in the first block given to FUNC;
3) make a deep copy of the second block offered to FUNC, but whenever a word appears in that copy that is also in the first block, change the context of the word IN THE COPY to be the context created in (2);
4) create a new FUNCTION! value that is based on the results of (2) and (3), and return that FUNCTION! as the value of (this invocation of) FUNC to whatever caused FUNC to be invoked.

The third element in the body of TRICK is (at this moment!) an empty block. It is there because LOAD created an empty block as the third element of the fourth element of CONSOLE-STRUCT-4, and then FUNC copied that empty block at (3) to create the third element of the block that serves as the body of the FUNCTION! created in (4).

So, at this moment, the third element of the body of TRICK is an empty block created by copying an empty block created by LOADing a string that contained -- in part -- a #"[" followed by a #"]". To save me some typing and you some reading, let's call that block ~HERBIE~ (the weird punctuation is to remind us that this is only our conversational name for something, it is not REBOL terminology nor notation).

Now when we evaluate the (global) TRICK, we get the same behavior that we were discussing earlier:
>> trick
== [1]
>> trick
== [1 1]
>> trick
== [1 1 1]

The reason is that -- in the body of TRICK -- the (local to TRICK) word FOO is set to the value of an expression that mutates its first argument, which is ~HERBIE~. ~HERBIE~ started off empty when the function was created. Each time that the function is evaluated, the (local to TRICK) word FOO is set to the value of an expression that modified ~HERBIE~. Since the third element of TRICK's body is a reference to ~HERBIE~, we will see the effects of those mutations when we look at TRICK's body.
>> second :trick
== [foo: append [1 1 1] 1]

Since TRICK's body was created by sort-of-copying a block that's still in CONSOLE-STRUCT-4, we *will*not* see the effects of the mutations there.
>> console-struct-4
== [trick: func [/local foo] [foo: append [] 1] ]

Since the first element of TRICK's body is a word that has a different context than the global one, all of this SETting of that word has no effect on the global FOO as seen below:
>> foo
== [1 "I'm back!"]

We can "tunnel" into the body of TRICK and see the value of TRICK's local FOO as follows:
>> get first second :trick
== [1 1 1]

As with our console input modeling above, there's still a chain of references to ~HERBIE~ so the value of ~HERBIE~ persists. And, since the body of TRICK is just a block, we can do block operations on it:
>> poke second :trick 3 [2]
== [foo: append [2] 1]

Now the body of TRICK no longer contains a reference to ~HERBIE~ because I POKEd a different value into the place where that reference to ~HERBIE~ used to be. However, there's still another reference to ~HERBIE~, held by TRICK's local FOO:
>> get first second :trick
== [1 1 1]

But look what happens when I pull another TRICK ...
>> trick
== [2 1]
>> second :trick
== [foo: append [2 1] 1]
>> get first second :trick
== [2 1]

I've killed ~HERBIE~ !!! (Good thing he was only a virtual name, or I'd be arrested! ;-) Now both the (local to TRICK) word FOO and the third element of TRICK's body refer to a new block. OBTW, that block was created when I typed the string "poke second :trick 3 [2]" into the console and REBOL LOADed it. It was then mutated when the body of TRICK was evaluated.

If you've read this far, you deserve an Olympic medal for the marathon!!!

The reason for using COPY in front of a "literal" block inside a REBOL function would be to prevent mutations through a reference to that block from persisting -- i.e. you want a fresh instance of that block's content every time. I know that the above discussion was painful and laborious to read, but I hope it makes clear that there's already some COPYing going on. Knowing *when* the COPYing happens and *what* values are COPYed makes a lot of difference IMHO.

As for CLEAR, it simply discards the content of a series, but doesn't replace the series itself.
>> foo
== [1 "I'm back!"]
>> oof
== [1 "I'm back!"]
>> clear foo
== []
>> oof
== []

so that OOF and FOO still both refer to the same series, it's just that a (particularly severe!) mutation to that series's value occurred. The consequence of THAT fact is illustrated with these two little tweedles:
>> dee: func [/local foo] [foo: append copy [] 1]
>> dum: func [/local foo] [foo: append clear [] 1]
I hope that I've belabored my model to the point that all of the following now make sense to you, most admirable and persistent reader!
>> a: dee
== [1]
>> b: dee
== [1]
>> c: dum
== [1]
>> d: dum
== [1]
>> append a 2
== [1 2]
>> b
== [1]
>> second :dee
== [foo: append copy [] 1]
>> append c 2
== [1 2]
>> d
== [1 2]
>> second :dum
== [foo: append clear [1 2] 1]
>> e: dee
== [1]
>> f: dum
== [1]
>> second :dum
== [foo: append clear [1] 1]

Or, if you'll pardon the pun, I hope that it's all CLEAR now!

-jn-

--
; sub REBOL {}; sub head ($) {@_[0]}
REBOL []
# despam: func [e] [replace replace/all e ":" "." "#" "@"]
; sub despam {my ($e) = @_; $e =~ tr/:#/.@/; return "\n$e"}
print head reverse despam "moc:xedef#yleen:leoj" ;