Updates to hof script

[1/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 0:58

Hi, Several new functions has been added to hof script, such as map-2, fold-2, inner-2, . and | I'll just shortly describe map-2: USAGE: MAP-2 f xs ys DESCRIPTION: Mapping two series via binary function: ((a -+ b -+ c) -+ [a] -+ [b] -+ [c]) MAP-2 takes any binary function 'f :: (a -+ b -+ c) and applies it binary-wise to elements of series [a] and [b], producing as a result a series [c]. The types must match obviously. Map-2 is generic enough to support all kinds of basic arithmetic operations on two vectors, or two matrices, of any dimensions of course. map-2 :+ [1 2 3] [4 5 6] map-2 :subtract [1 2 3] [4 5 6] map-2 :* [1 2 3] [4 5 6] map-2 (func[u v][2 * u + (3 * v)]) [1 2 3] [4 5 6] .: func[x y][sum map-2 :* x y] square-root sum map-2 (func[x y][(x - y) ** 2]) [1 2 3] [4 5 6] ; Euclidean distance sum map-2 (func[x y][abs (x - y)]) [1 2 3] [4 5 6] ; Manhattan distance One can use it for other purposes too: map-2 :xor "My secret message" "This is my not-really-random secret key" Operator '| provides for overloading of operators for numbers, vectors and matrices, if one really insists on the same symbols. For example: | :+ 4 5 | :+ [1 2 3 4] [7 2 0 4] | :+ [[1 2 3][4 5 6]] [[1 1 1][2 2 2]] Needs some error handling, though. Jan P.S. I do not know whether anyone reported it yet (Ladislav?), but these few things really bug me: :- confusion subtract/negate :> syntax error :< syntax error

[2/17] from: brett:codeconscious at: 16-Nov-2002 17:18

Hi Jan, Map-2 looks interesting.

> Mapping two series via binary function: > ((a -+ b -+ c) -+ [a] -+ [b] -+ [c])

I have to say though, the function signatures you include with your functions always remind me of marks in the sand made by the feet of birds! :^) Seriously I've not understood one of them, and I did read your post where you introduced them. Maybe there is a more accessible / more self-evident notation for this? I don't think I would be the only one on this list that has difficulty. Brett.

[3/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 2:39

Hi Brett,

> Map-2 looks interesting. > > > Mapping two series via binary function: > > ((a -+ b -+ c) -+ [a] -+ [b] -+ [c]) > > I have to say though, the function signatures you include with your > functions always remind me of marks in the sand made by the feet of birds! > :^) >

Hi-hi, I had exactly the same impression when I was first introduced to tensor notation - with all those superscripts, subscripts - covariant, contravariant stuff. I got used to it and even became an expert.

> Seriously I've not understood one of them, and I did read your post where > you introduced them. Maybe there is a more accessible / more self-evident > notation for this? I don't think I would be the only one on this list that > has difficulty. > Mapping two series via binary function: > ((a -+ b -+ c) -+ [a] -+ [b] -+ [c])

I am sure there are better ways to explain this. But all I can do for now is to go through it again - but this time ignoring a type checker. Here are 4 groups of types: 3 of them represent arguments, the last one is a return type. The types here are generic: a, b, c - as generic as they can be, including any-type!. You can always substitute a generic type by a more concrete type, as in: a ==> decimal. But you need to be consistent and do it everywhere, changing the original pattern to something like that: ((decimal -+ b -+ c) -+ [decimal] -+ [b] -+ [c]) Of course 'b can be also 'decimal, so it can be 'c, as in: ((decimal -+ decimal -+ decimal) -+ [decimal] -+ [decimal] -+ [decimal]) Now the type is completely resolved to a certain specific case. First argument: (decimal -+ decimal -+ decimal) represents a function, that takes two arguments of type 'decimal and returns 'decimal. An example of such a function is :+. Square brackets signify series. I could be here more specific and try to distinguish between specific kind of series, such as block versus string, etc. But for a start, throwing everything into one "series" bucket should do. So reading the final version of this specific pattern: MAP-2 is a function that accepts a binary function of the type (decimal -+ decimal -+ decimal) and two series made of decimals. It produces a series of decimals as a result. Accordingly, the following application is legal: map-2 :+ [1 2 3] [1 1 1] == [2 3 4] Now, if you look again at the original definition:

> Mapping two series via binary function: > ((a -+ b -+ c) -+ [a] -+ [b] -+ [c])

it shows a lot of choices and a lot of potential for you to choose from. Let's think: what sort of functions, other than (decimal -+ decimal -+ decimal) could we use with map-2? Maybe some sort of comparison function? How about this one? map-2 :greater? [1 8 3] [4 5 6] If you run 'help on greater? you will find that its corresponding signature is: (any -+ any -+ any) But help does not tell the truth here! In fact, this should be: (ord -+ ord -+ logic) You can only compare things that are orderable, such as numbers, dates, times, strings, characters, but not points for example. You cannot say whether or not 1x4 is greater than 4x9. An obviously the result should be logic! - either true or false. But at least number of arguments is correct. So what does it do? map-2 :greater? [1 8 3] [4 5 6] == == [false true false] The point I am trying to make is that the patterns like this:

> ((a -+ b -+ c) -+ [a] -+ [b] -+ [c])

supposed to be self-documenting. They should also inspire you to think hard about possible reusability of such patterns. Hmm, what would this do? ((string -+ integer -+ string) -+ [string] -+ [integer] -+ [string]) OK, firstly I would need a binary function that takes a string and an integer and produces string. Name and index perhaps? A lookup function to some translation table? Perhaps indices could stand for some foreign language codes? 1 for Swedish, 2 for Czech perhaps? map-2 :lookup ["John" "Frank"] [1 3] == ["Jan", "Frantisek"] Perhaps I should post a separate script hof-examples.r with a lot of well documented examples. What do you think? All the best, Jan

[4/17] from: g:santilli:tiscalinet:it at: 16-Nov-2002 10:50

Hi Jan, On Saturday, November 16, 2002, 8:39:58 AM, you wrote: JS> I am sure there are better ways to explain this. But all JS> I can do for now is to go through it again - but this JS> time ignoring a type checker. I think the problem here is with the notation. Most people is not familiar with it, and it is not intuitive enough to be able to guess what it means without someone telling you.

>> Mapping two series via binary function: >> ((a -+ b -+ c) -+ [a] -+ [b] -+ [c])

Would a notation such as: ((a b -> c) [a] [b] -> [c]) be more easy to understand by a casual reader? Regards, Gabriele. -- Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r

[5/17] from: jan::skibinski::sympatico::ca at: 16-Nov-2002 7:26

Hi Gabriele,

> I think the problem here is with the notation. Most people is not > familiar with it, and it is not intuitive enough to be able to

<<quoted lines omitted: 4>>

> ((a b -> c) [a] [b] -> [c]) > be more easy to understand by a casual reader?

I hope you don't mean: using -> instead of the -+. That was the intention to start with but Rebol would not let me use it. So the latter is just a poor imitation of the former. This cosmetic change aside, there is a problem with your suggestion. More complex types, which I hope to avoid at least at this stage, need so-called type constructors, such as 'Ket, as in: Scalar -> Ket a -> Ket a Abusing the definition: the type constructor is sort of a container with a generic type 'a inside. There could be therefore: Ket integer, Ket logic or even Ket (Ket tuple) Some other type constructors would need more than one type variables, such as Tree a b To be frank, when representing series I should have used such notation perhaps to be more specific about kind of series I have in mind, such as: Block a, Paren a, etc. Instead, I simplified everything to [a]. Not very strict, but at least showing the structure: [a] for vectors, [[a]] for vectors of vectors, etc. So we need some kind of separators, that would tell us where the types start and stop. Parens will not do, because they are needed for separating embedded functions. Generally accepted separator is ->, because it shows the direction "from to", as in length? :: [a] -> integer which reads: 'Length? has a type from a series 'a to integer. You feed it a series of any generic type and it responds with an integer. Whenever you see an arrow, this tells you that you deal with a function. If there is no arrow this mean just a simple data. A function can be completely unapplied, partially applied and completely applied, as in: :+ :+ 2 :+ 2 3 The corresponding types would be :+ :: integer -> integer -> integer "+ has type from integer to integer to integer" :+ 2 :: integer -> integer "+ 2 is a function that has type from integer to integer" :+ 2 3 :: integer "+ 2 3 has type integer" During this process you just remove one type and one arrow at each step, whenever you add one more argument. Oh boy, I should go back to bed for another hour or so. Did not have much sleep last night. Best wishes, Jan

[6/17] from: reffy:ulrich at: 16-Nov-2002 8:28

Hi Jan, Accordingly, the following application is legal: map-2 :+ [1 2 3] [1 1 1] == [2 3 4] Now, if you look again at the original definition:

> Mapping two series via binary function: > ((a -+ b -+ c) -+ [a] -+ [b] -+ [c])

How would you note that the result of (adding two decimals) yields a decimal unless, the operation overflows, and the result is coerced to a float instead of a decimal? Dick

[7/17] from: g:santilli:tiscalinet:it at: 16-Nov-2002 16:10

Hi Jan, On Saturday, November 16, 2002, 1:26:56 PM, you wrote: JS> I hope you don't mean: using -> instead of the -+. That's what happen when I don't try things on the REBOL console before posting. :-) Hmm, the closest representation is probably:

>> #-->

== #--> So you get:

>> [(a b #--> c)]

== [(a b #--> c)] JS> This cosmetic change aside, there is a problem JS> with your suggestion. More complex types, which JS> I hope to avoid at least at this stage, need JS> so-called type constructors, such as 'Ket, as in: Hmm, I understand. A wild idea could be using paths, such as Ket/a. Or, one could use a notation that is more similar to the parse dialect. I.e. like this: (a -+ b -+ c) becomes (a b #--> c) while ([a] -+ [b] -+ [c]) becomes ([some a] [some b] #--> [some c]) and you can also write something like: a: [integer! | [some integer!]] (a #--> integer!) and so on... Just my two cents, Gabriele. -- Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r

[8/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 11:23

Hi Dick, [reffy--ulrich--net] wrote:

> Hi Jan, > Accordingly, the following application is legal:

<<quoted lines omitted: 5>>

> How would you note that the result of (adding two decimals) yields a decimal > unless, the operation overflows, and the result is coerced to a float instead of a decimal?

For all I know a decimal in Rebol is a float. If you meant integer, then yes I cannot predict what sort of casting Rebol will do during its runtime. But since certain objects cannot be converted then Rebol will complain - sometimes very very late, somewhere deeply inside some loop. My type checker could eliminate some classes of such problems. Compared to Rebol, Haskell would do a thorough static typechecking first - discovering such bugs and not even letting me start the computation. But this is not the point. I am not trying to predict what Rebol will in fact do during runtime, I only wish to provide tools that suppose to help us with documenting complex functions, and to let us check validities of some classes of expressions before they are even executed. Look, I am not trying to impose any restrictions on Rebol - it's fine as it is with all this casting, mixed type arithmetic, mixed types inside blocks, etc. But when I write my own function that accepts an array of specific types, such as decimals I want to document it as such and tell the user up front that they should better use decimals, not the strings or a mixture of both. If my intention would be to accept any mixed array, then I would document it as such, [any]. Typeless systems of Smalltalk kind, or weakly (dynamicly) typed systems such as Rebol are the blessing and the curse - compared to the strong static type systems. I've been on both sides of the border, and I know all about disadvantages and advantages of such languages. As I said, Rebol is fine for what it supposed to be. It provides freedom of choices, of creating little universes, such as dialects, to impose extra rules of the game. All the best, Jan

[9/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 11:45

Hi Gabriele, These are all valid suggestions. I'll think about it. The notation is not cast in stone. I am still experimenting. For example, I have no decided yet on representation for Cartesian types. As an example, how do I represent types for such a block: [[3 "a"] [7 "c"] [10 "d"]]? In Haskell, heterogenous lists (here blocks), are not allowed (unless one uses existential quantification), so one have to use so-called tuples instead, as in: [(3, "a"), (7," c"), (10, "d")]. This list of pairs (tuples) has the type: [(Integer, String)] In Ocaml, the Cartesian types would be represented by something like: Integer x String and a list of such things as: [Integer x String] (I almost forgot the Ocaml syntax, so that might be incorrect.) Regards, Jan Gabriele Santilli wrote:

[10/17] from: reffy:ulrich at: 16-Nov-2002 11:25

Hi Jan, In my language, if adding two integers overflows, then I automatically coerce the result to a higher type. My question is more in line with liking your ideas, but wondering how one would include such possibilities in your notation.

[11/17] from: g:santilli:tiscalinet:it at: 16-Nov-2002 19:14

Hi Jan, On Saturday, November 16, 2002, 5:45:53 PM, you wrote: JS> [[3 "a"] [7 "c"] [10 "d"]]? In a PARSE-alike notation, one could write: [some [into [integer! string!]]] This notation has the IMO non-trivial advantage that you can use PARSE to check for validity:

>> parse [[3 "a"] [7 "c"] [10 "d"]] [some [into [integer! string!]]]

== true

>> parse [[3 "a"] [7 8] [10 "d"]] [some [into [integer! string!]]]

== false

>> parse [[3 "a"] 7 "c" [10 "d"]] [some [into [integer! string!]]]

== false Also, one could even be more specific on the valid content one can get; if you can accept empty blocks too, you can use ANY instead of SOME; if your blocks can contain from 4 to 8 integers, you can use the notation [4 8 integer!], and so on. Regards, Gabriele. -- Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r

[12/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 13:15

Hi Dick, [reffy--ulrich--net] wrote:

> Hi Jan, > > In my language, if adding two integers overflows, then I automatically coerce the result to a higher type.

So do Smalltalk and Rebol. But all static languages I know of would fail here. Sometimes in quite a bad way - instead of crashing they would continue with wrong values. For example, Eiffel relies on a C compiler in its compile mode (It has also its native interpretive mode, useful during development - Melting Ice Technology, as they advertise it). Anyway when C register overflows it returns 0 - or at least that was a case with GNU compiler many years ago. No access to flags, so this is what you get.and Eiffel would inherit this problem. So much for safety and strong typing. See description of example of failure of Ariadne rocket, as popularized by Meyer. But that does not mean that Rebol would not fail because it does a silent coercion to decimals. It will fail at 2 ** 1024. That's astronomical number and why one would want to go that far? Well the example of the Ackerman function demonstrates how easy it is to get there. From the coercive perspective Smalltalk behaves better than Rebol because it switches from SmallInteger to LargePositiveInteger or LargeNegativeInteger, internally represented as sequences of bytes. Bypassing the float - unless you want to cast the numbers to float. Sky is you limit here, or rather the amount of memory Smalltalk is entitled to use for storing those huge numbers. And you must be of course aware of the fact that the bigger the float gets, the poorer resolution it has - there are tremendous holes between two closest float numbers. It's all about logarithmic scales. Some people still do not get it and try to use floats for computing with money.Think about all those missing chunks. Thinking about it now I have to examine the Rebol decimal! and money! a bit closer. I was assuming floats, but that cannot be the case in view of the above. Jan

> My question is more in line with liking your ideas, but wondering how one would include such > possibilities in your notation.

None that I know of. You can always declare and force using

[13/17] from: greggirwin:mindspring at: 16-Nov-2002 12:10

Hi Brett, BH> ...Maybe there is a more accessible / more self-evident BH> notation for this? I don't think I would be the only one on this list that BH> has difficulty. You're not the only one. I've followed the discussions, and I can see great value in Jan's contributions, but for me it's a lot of work to become comfortable with the notation and some of the concepts so they flow naturally through my brain. I tend to think down a slightly different path, simply because of my background, experience, and preferences. My mind leans towards dialects that allow us to specify these kinds of concepts in human friendly terms, even if they are more verbose and/or aren't used directly in our REBOL code (e.g. code generators). That's me, but I'm very appreciative of the other side (those smart people like Jan, Romano, Ladislav, Joel, Gabriele... :) and what I learn from watching them apply *their* view to things. The more variety we have in how REBOL is applied, the better IMO. So, thanks to all you folks who think at a higher order than me (pun intended :)! -- Gregg

[14/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 14:23

Hi Gabriele, These are great examples. I have been already thinking about switching to PARSE, rather than to continue parsing the signatures the way I do it now. Which is OK, but shaky from debugging perspective. The good new about 'parse is that all debugging is related mostly to debugging of rules. Once this is OK, the rest supposed to be rock solid, isn't it? But one must execute some caution because [integer! string!] in a function definition would represent a set of choices, rather than a Cartesian product. I am still not convinced that the [some [into [integer! string!]]] is that much readable. Think about those long signatures. But I will take it seriously under consideration. Thanks for all your input so far. All the best, Jan Gabriele Santilli wrote:

[15/17] from: g:santilli:tiscalinet:it at: 16-Nov-2002 21:19

Hi Jan, On Saturday, November 16, 2002, 8:23:15 PM, you wrote: JS> But one must execute some caution because [integer! string!] JS> in a function definition would represent a set of choices, rather than JS> a Cartesian product. Indeed; and that could be expressed in PARSE notation as [integer! | string!] JS> I am still not convinced that the JS> [some [into [integer! string!]]] JS> is that much readable. Once you are familiar with PARSE, it is. I think that it is better having as few different notations as possible, so that people have less things to learn; since PARSE is already in, and a REBOL programmer is supposed to know it to some extent, I assume such a notation would look much more familiar to him/her. JS> Think about those long signatures. Indeed, that is likely to be a problem. Short or readable, pick one. (As Joel would say. ;-) Of course, mine is just an idea; maybe it can just be useful to make you think of something better. :-) Spending all my cents today, Gabriele. -- Gabriele Santilli <[g--santilli--tiscalinet--it]> -- REBOL Programmer Amigan -- AGI L'Aquila -- REB: http://web.tiscali.it/rebol/index.r

[16/17] from: brett:codeconscious at: 17-Nov-2002 12:42

Thanks Jan for your explanation - it makes more sense to me now. Reading your posts and Gabriele's posts on the syntax has been instructive too. No wonder I was confused the first time. :^) I haven't much to suggest regarding notation unfortunately, I can see you can create some fairly complex expressions. I was thinking about the vector notation of [] and vector of vectors with [[]]. Are these to be viewed as constraints over REBOL blocks or something else, an implied datatype in a sense? Regards, Brett.

[17/17] from: jan:skibinski:sympatico:ca at: 16-Nov-2002 22:05

Hi Brett,

> I was thinking about the vector notation of [] and vector of vectors with > [[]]. Are these to be viewed as constraints over REBOL blocks or something > else, an implied datatype in a sense? >

I would view them as constraints for now. But the notation may evolve to accept type constructors, as in: inner-product :: Matrix ring -> Matrix ring -> Matrix ring This would not tell us anything about specific data structures used to represent matrices. It could have been implemented as trees for all I know. Another implementation could use explicit blocks: inner-product :: [[number]] -> [[number]] -> [[number]] Both approaches could be valid as far as I can see. Jan

Notes

Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted