[REBOL] Re: What language am I looking for?
From: joel:neely:fedex at: 16-May-2001 23:33
Hi, Brent, Ken, and all...
I like REBOL. REBOL is good.
(Did I mention that I like REBOL?) Having said that...
Given REBOL's compact and expressively powerful notation,
the REBOL implementation is not as fast as those of more
limited notations. It is also the case that REBOL's
notation and implementation contain some subtleties that
make it impossible for me to call it "simple". I believe
these tradeoffs are inescapable, and offer some concrete
examples in support of that belief.
I must emphasize: as a vehicle for exploring programming
ideas, REBOL comes out on the winning side of the trades.
-jn-
Brent Meeker wrote:
> It would seem to be true of all the general purpose
> languages I know of that increased abstraction creates a
> penalty for the interpreter or at compile time. Whether
> it creates a run-time penalty would seem to be a problem
> the compiler could, at least in principle, overcome.
>
Errrrmmm. What compiler? (Or perhaps I should say, "which
compiler, and when?")
Let me be clear that I'm ***NOT*** stepping into the bogus
mind-set that describes some languages as "compiled" and
other languages as "interpreted". Those adjectives describe
properties of the implementation layer, not the notational
layer. It is entirely possible to implement C via an
interpreter, as well as a compiler. It is entirely possible
to implement LISP via a compiler as well as an interpreter.
(Both have been done.)
That said, some features of a notation can impose constraints
on what the implementation must deal with. REBOL notation
refuses the conventional code-vs-data distinction, instead
allowing a running REBOL program to construct a block (as
if it were "data") and then immediately DO it (as if it were
executable code) or use it to construct a function or
object (as if it were "source code"). At the very next
moment, the program may again manipulate the contents of the
block (back to data-like treatment). And so on...
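A minimal console sketch of that fluidity (the variable names
are invented for illustration):

```rebol
blk: [1 + 2]        ; build a block as ordinary data
print do blk        ; evaluate it as code: prints 3
poke blk 3 10       ; mutate it as data again: now [1 + 10]
print do blk        ; re-evaluate: prints 11
f: func [] blk      ; or use it as a function's "source code"
print f             ; prints 11
```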
Some strategies for implementing such notational flexibility
are:
0) Interpret the language. Accept the reduced execution
speed as the cost of flexibility.
1) Restrict the notation to a subset that cannot express
such manipulations. Compile that to machine code.
Accept the reduced flexibility as the cost of higher
execution speed.
2) Use a "just-in-time" compilation strategy, in which the
compiler is part of the run-time system. Whenever a
block is first encountered in a code-like usage, compile
it to machine code and thereafter treat each code-like
use of the block as an invocation of that machine code.
However, any data-like mutation of the block voids the
warranty on the compiled representation. Flag the block
as no longer being compiled. Any subsequent code-like
use must first re-invoke the compiler.
Accept the run-time penalty of compilation, as well as
the space-wise and code-wise complexity of putting the
compiler into the run-time system, as the costs of
higher execution speed. Accept the fact that net speed
over some programs (e.g., those that keep manipulating
and evaluating the same block(s) repeatedly) may not
be much of a gain -- if any -- over plain interpretation.
(Note that "data-like mutation of the block" includes any
changes to its content, the content of its content, etc.
Such transitive closures are non-trivial to compute! I'll
not pursue this aspect further, at least for now; I just
wanted to point out that there are *HAIRY* details.)
3) Use a probabilistic "just-in-time" compilation strategy,
in which a block is interpreted for the first few code-
like uses. Statistics of such use are kept. After the
number of such uses (or frequency of such uses) passes
some threshold, the run-time system then decides to make
and cache a machine-code version of the block. This has
all of the costs of (2) plus the complexity of the
decision heuristics. It also has the same property that
constant twiddle-and-run activity frustrates the speed-
up mechanisms.
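Here is one way the bookkeeping for (3) might be sketched in
REBOL itself. All the names, the threshold, and the use of a
cached REBOL function as a stand-in for real machine code are
invented for illustration; this is a toy model, not a proposed
implementation.

```rebol
; Toy model of strategy (3): interpret a block until it has
; been used often enough, then cache a "compiled" stand-in
; (here just a REBOL function built from a frozen copy), and
; void the cache on any data-like mutation.
hot-block: func [body [block!]] [
    make object! [
        code: body
        uses: 0
        threshold: 3
        cached: none
        run: does [
            uses: uses + 1
            either none? :cached [
                ; below threshold: plain interpretation
                if uses > threshold [cached: func [] copy/deep code]
                do code
            ][
                cached          ; fast path: invoke the cached stand-in
            ]
        ]
        mutate: func [index value] [
            poke code index value
            cached: none        ; mutation voids the warranty...
            uses: 0             ; ...and resets the statistics
        ]
    ]
]

; hb: hot-block [1 + 2]
; loop 5 [hb/run]       ; interpreted, then cached
; hb/mutate 3 10        ; cache discarded
; hb/run                ; back to interpretation
```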
I'm very interested to see if anyone can suggest any others.
Note, again, that I'm talking about implementation strategies,
not properties of the "source code" notation.
Let's see how these alternatives fare on some trade-off
criteria, with the following definitions. By "compact" I
mean both the source code and the size of the run-time
system (interpreter, compiler, libraries, whatever...)
By "powerful" I mean expressive power; others are free to
define their own contexts. ;-) By "simple" I'm referring
to the implementation and run-time system, and *NOT* to the
notation. By "fast" I mean... fast execution.
    Option   Compact   Powerful   Simple    Fast
    ------   -------   --------   -------   ---------
     (0)     yep       yep        likely    nope
     (1)     sort of   less so    less so   best
     (2)     nope      yep        nope      sometimes
     (3)     no way    yep        NOT!      sometimes
Obviously, these are broad-brush statements. However, I
invite anyone who wants to disagree (in qualitative, broad-
brush terms, please!) either to provide another alternative
with compelling arguments why it obviously doesn't fall into
a trade-off, or to show compelling reason why one or more of
the above generalities must be flawed for some case.
> On 16-May-01, Ken Anthony wrote:
...
> >
> > Now I'd like to throw another challenge out to this list.
> >
> > Does being able to be more expressive, through greater
> > levels of abstraction, always have to lead to a
> > performance loss? This is a thorny issue that I've had
> > difficulty grappling with. It's easy to find examples
> > where this is true. I'm looking for examples where this
> > may not be true.
> >
I'd be delighted to see such examples myself. I just haven't
found any (except for highly special circumstances).
Please bear in mind (and pardon me for perhaps belaboring the
obvious) that performance loss is not the only trade-off.
Let me give a specific example.
I've often described REBOL to technically-savvy acquaintances
as being "like LISP without parentheses". The consistent
(obsessive? ;-) use of parentheses in LISP/Scheme makes the
effort of both lexical scanning and pretty-printing almost
nil ( ;-) and makes them essentially the same task. Now
consider this:
REBOL has a very rich set of data types. This makes the work
of lexical scanning a bit harder, but still quite feasible.
REBOL is a truly extensible language; you can define words to
be new "control structures" and even define new "defining
words", all within REBOL itself. This makes the work of
writing a fully general pretty-printer formally impossible!
It is not even possible in principle, except during expression
evaluation, to predict how a sequence of REBOL words could
be fully parenthesized in a semantically-transparent way!
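To make that concrete, here is a hypothetical user-defined
control structure and a hypothetical defining word (both names
and definitions are invented for illustration):

```rebol
; a brand-new "control structure", defined in plain REBOL
unless: func [cond body [block!]] [if not cond body]

; a brand-new "defining word"
assign-both: func ['w1 'w2 value] [set w1 value set w2 value]

; A static pretty-printer looking at
;     unless flag [print "skipped"]
;     assign-both x y 10
; cannot know, without evaluating UNLESS and ASSIGN-BOTH,
; that [print "skipped"] is a code body to be grouped with
; its condition, or that X and Y are assignment targets
; rather than values to be fetched.
unless false [print "reached"]    ; prints "reached"
```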
Here's a little REBOL parlor game: how many ways can we
insert parentheses into the following block
[a + b c]
and still have a syntactically valid REBOL expression?
Depending on how those three single-letter words are defined,
how many of those parenthesizations are semantically valid
and semantically equivalent to the original block?
I quickly dashed off four; the output below shows them at
work. I'd be interested to see any others.
>> pp-NOT
How many ways can we parenthesize the block
[a + b c]
and how many of those are valid (and equivalent)
depending on how A, B, and C are defined?
Case 1
[a + b c] = [121]
1 [a (+ b c)] [121]
2 [(a + b) c] ERROR
3 [(a + b c)] [121]
4 [a + (b c)] ERROR
Case 2
[a + b c] = [11 100]
1 [a (+ b c)] [1 110]
2 [(a + b) c] [11 100]
3 [(a + b c)] [100]
4 [a + (b c)] [101]
Case 3
[a + b c] = [404]
1 [a (+ b c)] ERROR
2 [(a + b) c] ERROR
3 [(a + b c)] [404]
4 [a + (b c)] ERROR
Case 4
[a + b c] = [903]
1 [a (+ b c)] ERROR
2 [(a + b) c] ERROR
3 [(a + b c)] [903]
4 [a + (b c)] [903]
>>
In case anyone is mildly interested, here is the function
that wrote that little essay.
8<----------------------------------------------------------
pp-NOT: func [
    /local a b c code-like case-1 case-2 case-3 case-4
        test-case test-cases
][
    code-like: [a + b c]
    case-1: does [
        a: func [n] [n * n]
        b: 1
        c: 10
        print "Case 1"
    ]
    case-2: does [
        a: 1
        b: 10
        c: 100
        print "Case 2"
    ]
    case-3: does [
        a: func [:op left right] [
            (op (left * left) (right * right))
        ]
        b: 2
        c: 20
        print "Case 3"
    ]
    case-4: does [
        a: 3
        b: func [n] [n * n]
        c: 30
        print "Case 4"
    ]
    test-case: func [n [number!] blk [block!] /local rb] [
        print [
            tab n tab mold blk tab
            either error? try [
                rb: reduce blk
            ][
                "ERROR"
            ][
                mold rb
            ]
        ]
    ]
    test-cases: does [
        print [
            tab tab
            mold code-like
            tab "=" tab
            mold reduce code-like
        ]
        test-case 1 [a (+ b c)]
        test-case 2 [(a + b) c]
        test-case 3 [(a + b c)]
        test-case 4 [a + (b c)]
        print ""
    ]
    print [
        "How many ways can we parenthesize the block" newline
        tab mold code-like newline
        "and how many of those are valid (and equivalent)" newline
        "depending on how A, B, and C are defined?" newline
    ]
    case-1 test-cases
    case-2 test-cases
    case-3 test-cases
    case-4 test-cases
]
8<----------------------------------------------------------
Have fun!
-jn-
--
------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com