[REBOL] A prototype typechecker
From: jan:skibinski:sympatico:ca at: 5-Nov-2002 23:00
Hi all,
I've been working very hard in the past week or so
on my prototype of a typechecker for Rebol.
Since I do not currently maintain my web site,
and since I do not want to pollute the library
by yet another script of mine, I decided to send this
l-o-o-n-g note to the mailing list. I hope you
forgive me for this. I should be ready with the
prototype in a few days, but since I got really tired
of all that work I'd like to describe the outline
of its features and ask for some feedback.
Regards,
Jan
A typechecking session
----------------------
>> check [foldl]
== (foldl | (a -+ b -+ a) -+ a -+ [b] -+ a)
What's that? I am asking my prototype typechecker
to check a type-correctness of the expression
within the block. I am starting with a sole function
'foldl. Its functionality will be explained later.
For a moment let us concentrate on the types only.
The typechecker says: Function 'foldl has a type
shown on the right hand side of the pipe sign.
The type, or a signature of this function uses two
generic type variables 'a and 'b: each corresponding to
any-type!. The signature sets a usage pattern, which
will become more specialized down the road.
The -+ symbol pretends to be an arrow separator.
There are four elements in this signature: the
first three types describe the types of arguments
the function 'foldl will accept, and the last type
is its return type.
The first type on the left represents a generic
function which accepts two arguments 'a and 'b,
and returns a result of type 'b. In particular,
'a and 'b can be of the same type.
Let's us think of a function of such type, which
could be accepted by the foldl. The :add function
would do, but - to make things more interesting -
I will use instead my home brewed version 'sum.
Back to the typechecker:
>> check [foldl :sum]
== (foldl :sum | ring -+ [ring] -+ ring)
What's now? The typechecker says that the stuff on
the left of the pipe sign is a new function,
whose type is on the right. It will accept an
element of type 'ring and a series of elements
of type 'ring and it will produce a 'ring.
The signature of the :sum itself can be checked
separately:
>> check [sum]
== (sum | ring -+ ring -+ ring)
The 'sum has been accepted as an argument to
'foldl because its type is a subset of the
generic pattern advertised by 'foldl:
(a -+ b -+ a)
As a result the type specialization took place
and the resulting type is much simple now.
But what is a 'ring? This is an alias for a set
of types, each having a property of a mathematical
ring: with all its standard rules. This stuff
is defined in the global block 'typesets:
>> typesets/ring
== (number money tuple pair)
Please note that the ring includes neither
the 'date nor the 'time: you cannot add or
multiply two dates or types together.
Using the typeset names instead of their explicit
values has several advantages - none of the least
is the space saver.
Back to the typechecker: adding next argument ...
>> check [foldl :sum 0]
== (foldl :sum 0 | [integer] -+ integer)
Now we can see even greater specialization
in action: the resulting function accepts a series
of integers and produces a single integer.
Finally, let us provide the last missing argument,
the expected series of integers:
>> check [foldl :sum 0 [1 2 3 4]]
== (foldl :sum 0 [1 2 3 4] | integer)
Now we can safely execute the block expression
>> do [foldl :sum 0 [1 2 3 4]]
== 56
and obtain a sum of [1 2 3 4].
Without any detailed explanation 'foldl is
a generalized accumulator, which uses a binary
operator - operating on pairs of elements:
one precomputed and one taken from the series.
At this point I could have taken you through few
other pages to demonstrate how the typechecker
detects the wrong types, how to get into debugging
mode (check/debug), how to show step by step
reductions (check/step), etc. - but I'll spare you
the pain for now.
Instead, I will shortly discuss few key
design factors and proposals.
============================================
The existing Rebol's system of datatypes
is not good enough to support the scheme just
outlined. It needs extra support for functions.
The generic any-function!, function!, etc. types
are not specific enough. The typechecker needs
patterns: which can be either very generic, like
the 'foldl pattern, or very specific, like
a 'poly pattern:
>> check [poly]
== (poly | [number] -+ number -+ number)
(Function 'poly computes value of a polynomial
for a given base: 10, 16, 0.1, whatever.}
I can see two ways of doing this:
1. Provide a global repository of patterns:
-------------------------------------------
A small sample follows.
patterns: [
map ((a -+ b) -+ [a] -+ [b])
filter ((a -+ logic) -+ [a] -+ [a])
foldl ((a -+ b -+ a) -+ a -+ [b] -+ a)
poly ([number] -+ number -+ number)
foldl ((a -+ a -+ a) -+ [a] -+ a)
scanl ((a -+ b -+ a) -+ a -+ [b] -+ [a])
cat ([[a]] -+ [a])
.. ([ord] -+ [ord])
odd? (ord -+ logic)
even? (ord -+ logic)
]
As you can see, here is a mixture of few existing
standard functions and some of my own. Do note
that I have given very specific patterns to 'odd?
and 'even? - more specific and safer that the official Rebol
definitions would imply.
The typeset 'ord means a set of orderable
things:
>> typesets/ord
== (number char date money time)
A very extensive table of patterns could be produced
consisting of all the patterns for all existing standard
functions.
2. Embedding patterns inside the new functions
----------------------------------------------
This alternative has several advantages
and disadvantages compared to the former method.
I will skip their discussion for the moment but instead
show a source of the 'sum function, that has been created
using this method:
sum: formal-function [
{A wrapper around a :+ operator
to limit its usage to typeset 'ring.
This example also shows one of two
ways to specify formal usage 'pattern
for a function.
(ring -+ ring -+ ring)
}
x [(typeset ring)]
y [(typeset ring)]
/local result [(typeset ring)]
][
pattern (ring -+ ring -+ ring)
][
result: x + y
result
]
3. Existing typesets can be captured
and stored in the global block 'typesets
-------------------------------------
The example shown here demonstrates my laziness
in the name invention. A good deal of thought should
be required to make it palatable.
typesets: [
generic (a b c d e f)
or-able (logic number char tuple binary image)
add-able (number pair char money date time tuple)
multiply-able (number pair char money time tuple)
skip-able (series port)
bind-able (block word)
find-able (series port bitset)
exclude-able (series bitset)
difference-able (series bitset date)
parse-able (block string none)
save-able (file url binary)
set-able (any-word block)
pick-able (series pair event money date
time object port tuple any-function)
ord (number char date money time)
ring (number money tuple pair)
]
============================================
That's it! Thank's for you attention.
I will post the prototype typechecker as soon as
I am comfortable with it more or less. It will
override the file %signature.r
Jan