[REBOL] Re: What language am I looking for?

From: joel:neely:fedex at: 16-May-2001 23:33

Hi, Brent, Ken, and all...

I like REBOL. REBOL is good. (Did I mention that I like REBOL?)

Having said that... Given its compact and expressively powerful notation, REBOL's implementation is not as fast as those of more limited notations. It is also the case that REBOL's notation and implementation contain some subtleties that make it impossible for me to call it "simple". I believe these trade-offs are inescapable, and offer some concrete examples below in support of that belief. I must emphasize: as a vehicle for exploring programming ideas, REBOL comes out on the winning side of the trades.

-jn-

Brent Meeker wrote:
> It would seem to be true of all the general purpose
> languages I know of that increased abstraction creates a
> penalty for the interpreter or at compile time. Whether
> it creates a run-time penalty would seem to be a problem
> the compiler could, at least in principle, overcome.
Errrrmmm. What compiler? (Or perhaps I should say, "which compiler, and when?")

Let me be clear that I'm ***NOT*** stepping into the bogus mind-set that describes some languages as "compiled" and other languages as "interpreted". Those adjectives describe properties of the implementation layer, not the notational layer. It is entirely possible to implement C via an interpreter, as well as a compiler. It is entirely possible to implement LISP via a compiler, as well as an interpreter. (Both have been done.)

That said, some features of a notation can impose constraints on what the implementation must deal with. REBOL's notation refuses the conventional code-vs-data distinction, instead allowing a running REBOL program to construct a block (as if it were "data") and then immediately DO it (as if it were "executable code"), or use it to construct a function or object (as if it were "source code"). At the very next moment, the program may again manipulate the contents of the block (back to data-like treatment). And so on... (A small sketch after the table below illustrates this duality.)

Some strategies for implementing such notational flexibility are:

0) Interpret the language. Accept the reduced execution speed as the cost of flexibility.

1) Restrict the notation to a subset that cannot express such manipulations. Compile that to machine code. Accept the reduced flexibility as the cost of higher execution speed.

2) Use a "just-in-time" compilation strategy, in which the compiler is part of the run-time system. Whenever a block is first encountered in a code-like usage, compile it to machine code and thereafter treat each code-like use of the block as an invocation of that machine code. However, any data-like mutation of the block voids the warranty on the compiled representation: flag the block as no longer being compiled, and any subsequent code-like use must first re-invoke the compiler. Accept the run-time penalty of compilation, as well as the space-wise and code-wise complexity of putting the compiler into the run-time system, as the costs of higher execution speed. Accept the fact that the net speed over some programs (e.g., those that keep manipulating and evaluating the same block(s) repeatedly) may not be much of a gain -- if any -- over plain interpretation.

(Note that "data-like mutation of the block" includes any changes to its content, the content of its content, etc. Such transitive closures are non-trivial to compute! I'll not pursue this aspect further, at least for now; I just wanted to point out that there are *HAIRY* details.)

3) Use a probabilistic "just-in-time" compilation strategy, in which a block is interpreted for the first few code-like uses and statistics of such use are kept. After the number (or frequency) of such uses passes some threshold, the run-time system decides to make and cache a machine-code version of the block. This has all of the costs of (2) plus the complexity of the decision heuristics. It also has the same property that constant twiddle-and-run activity frustrates the speed-up mechanisms.

I'm very interested to see if anyone can suggest any others. Note, again, that I'm talking about implementation strategies, not properties of the "source code" notation.

Let's see how these alternatives fare on some trade-off criteria, with the following definitions. By "compact" I mean both the source code and the size of the run-time system (interpreter, compiler, libraries, whatever...). By "powerful" I mean expressive power; others are free to define their own contexts. ;-)
;-) By "simple" I'm referring to the implementation and run-time system, and *NOT* to the notation. By "fast" I mean... fast execution. Option Compact Powerful Simple Fast ------ ------- -------- ------- --------- (0) yep yep likely nope (1) sort of less so less so best (2) nope yep nope sometimes (3) no way yep NOT! sometimes Obviously, these are broad-brush statements. However, I invite anyone who wants to disagree (in qualitative, broad- brush terms, please!) either to provide another alternative with compelling arguments why it obviously doesn't fall into a trade-off, or to show compelling reason why one or more of the above generalities must be flawed for some case.
> On 16-May-01, Ken Anthony wrote:
...
> > Now I'd like to throw another challenge out to this list.
> >
> > Does being able to be more expressive, through greater
> > levels of abstraction, always have to lead to a
> > performance loss? This is a thorny issue that I've had
> > difficulty grappling with. It's easy to find examples
> > where this is true. I'm looking for examples where this
> > may not be true.
I'd be delighted to see such examples myself. I just haven't found any (except for highly special circumstances). Please bear in mind (and pardon me for perhaps belaboring the obvious) that performance loss is not the only trade-off.

Let me give a specific example. I've often described REBOL to technically-savvy acquaintances as being "like LISP without parentheses". The consistent (obsessive? ;-) use of parentheses in LISP/Scheme makes the effort of both lexical scanning and pretty-printing almost nil ( ;-) and essentially the same task. Now consider this:

REBOL has a very rich set of data types. This makes the work of lexical scanning a bit harder, but still quite feasible.

REBOL is a truly extensible language; you can define words to be new "control structures" and even define new "defining words", all within REBOL itself. This makes the work of writing a fully general pretty-printer formally impossible! It is not even possible in principle, except during expression evaluation, to predict how a sequence of REBOL words could be fully parenthesized in a semantically-transparent way! (A tiny demonstration follows the transcript below.)

Here's a little REBOL parlor game: how many ways can we insert parentheses into the following block

    [a + b c]

and still have a syntactically valid REBOL expression? Depending on how those three single-letter words are defined, how many of those parenthesizations are semantically valid and semantically equivalent to the original block? I quickly dashed off four; the output below shows them at work. I'd be interested to see any others.
>> pp-NOT
How many ways can we parenthesize the block
    [a + b c]
and how many of those are valid (and equivalent)
depending on how A, B, and C are defined?

Case 1
        [a + b c]    =  [121]
    1   [a (+ b c)]     [121]
    2   [(a + b) c]     ERROR
    3   [(a + b c)]     [121]
    4   [a + (b c)]     ERROR
Case 2
        [a + b c]    =  [11 100]
    1   [a (+ b c)]     [1 110]
    2   [(a + b) c]     [11 100]
    3   [(a + b c)]     [100]
    4   [a + (b c)]     [101]
Case 3
        [a + b c]    =  [404]
    1   [a (+ b c)]     ERROR
    2   [(a + b) c]     ERROR
    3   [(a + b c)]     [404]
    4   [a + (b c)]     ERROR
Case 4
        [a + b c]    =  [903]
    1   [a (+ b c)]     ERROR
    2   [(a + b) c]     ERROR
    3   [(a + b c)]     [903]
    4   [a + (b c)]     [903]
>>
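And the tiny demonstration promised earlier -- a minimal sketch (added for illustration, not part of the original message, assuming nothing beyond plain REBOL/Core) of why a fully general pretty-printer is impossible:

    twice: func [body [block!]] [do body do body]   ; a brand-new "control structure"
    twice [print "again"]                           ; prints "again" twice

No lexical scanner can tell that TWICE consumes exactly one argument here -- i.e., that the second line groups as (twice [print "again"]) -- without first evaluating the definition of TWICE.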
In case anyone is mildly interested, here is the function that wrote that little essay.

8<----------------------------------------------------------
pp-NOT: func [
    /local a b c code-like
][
    code-like: [a + b c]

    case-1: does [
        ; A squares its argument, so [a + b c] groups as
        ; a (+ b c) = a 11 = 121
        a: func [n] [n * n]
        b: 1
        c: 10
        print "Case 1"
    ]
    case-2: does [
        ; all three words are just numbers: 1 + 10, then 100
        a: 1
        b: 10
        c: 100
        print "Case 2"
    ]
    case-3: does [
        ; A takes the + operator itself (unevaluated, via :op) plus
        ; two values: (2 * 2) + (20 * 20) = 404
        a: func [:op left right] [
            (op (left * left) (right * right))
        ]
        b: 2
        c: 20
        print "Case 3"
    ]
    case-4: does [
        ; B is a function applied to C: 3 + (30 * 30) = 903
        a: 3
        b: func [n] [n * n]
        c: 30
        print "Case 4"
    ]

    test-case: func [n [number!] b [block!] /local rb] [
        print [
            tab n tab mold b tab
            either error? try [
                rb: reduce b
            ][
                "ERROR"
            ][
                mold rb
            ]
        ]
    ]

    test-cases: does [
        print [
            tab tab mold code-like tab "=" tab mold reduce code-like
        ]
        test-case 1 [a (+ b c)]
        test-case 2 [(a + b) c]
        test-case 3 [(a + b c)]
        test-case 4 [a + (b c)]
        print ""
    ]

    print [
        "How many ways can we parenthesize the block" newline
        tab mold code-like newline
        "and how many of those are valid (and equivalent)" newline
        "depending on how A, B, and C are defined?" newline
    ]
    case-1 test-cases
    case-2 test-cases
    case-3 test-cases
    case-4 test-cases
]
8<----------------------------------------------------------

Have fun!

-jn-

--
------------------------------------------------------------
Programming languages: compact, powerful, simple ...
     Pick any two!
joel'dot'neely'at'fedex'dot'com