Mailing List Archive: 49091 messages

[REBOL] The Virtues of Zero: Was "Correct" Behaviour? Was False = 2 ?

From: joel::neely::fedex::com at: 3-Jul-2001 3:12

Hello, all!

I've already said enough (and more than enough!) on this topic. For me the issue is *not* The Language That Must Not Be Named, but rather simplicity, consistency, models, and Mathematics. With these few, somewhat random thoughts I respectfully bring to a close my participation in this topic.

I'm not trying to agitate for the remaking of REBOL, nor am I trying to foment discontent. I am trying to make the case for the benefits of consistency, and for the long-view tradeoff of a little learning at first vs. a lot of busy-work forever.

-jn-

If one learns the simple rule "everything begins at zero" then much of computing (and Mathematics) simply falls into place.

Peano's postulates for the natural numbers (aka "the counting numbers" in elementary school) are essentially:

- zero is a natural number
- the successor of a natural number is a natural number

upon which the whole of Mathematics is based.

Edsger Dijkstra (who, IIRC, has no love for TLTMNBN) has been quoted as saying, "Zero is the most natural number," based on the above. (In fact, his custom for years has been to number the pages of his writings beginning with zero. His explanation is that each page is labeled with the number of pages that preceded it.)

We call the point labeled zero on the number line "the origin", because that's where everything starts. It seems entirely reasonable to me to label the starting point of a block with zero, as that is the origin of the block.

Labeling positions in a data structure is not the same as counting them. House numbers (especially within a single block!) are not consecutive, but this does not surprise anyone I've ever met. Since labels can start anywhere, and can increase in arbitrary quanta, the simplest labeling scheme is to begin at zero and increase by one.

WRT counting, we always implicitly start with zero. If my wife asks me to fetch a dozen apples from the store, the bag contains zero apples when I open it. As I add each apple, the count becomes the successor of the previous count. Shades of Peano!

I am totally unconvinced by any appeal to the expectations of non-programmers for many reasons, including the fact that by definition their intuitions are untrained (or I could say trained for other tasks). Their intuitions have not yet been trained with MANY useful bits of computing knowledge which may strike them as unexpected at first.

* REBOL ignores the common conventions of high-school algebra with respect to expressions such as

      4 + 5 * 2

  and

      if counter + 1 = limit ...

* The digit characters #"0" thru #"9" precede the alphabetic characters #"a" thru #"z".
* Strings and integers sort differently (this one is more of a shock to their naive intuitions than any other fact, in my experience!)
>> foo: repeat i 100 [append [] form i - 1]
== ["0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "2...
>> sort foo
== ["0" "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "2" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "3" "30" "31" "...
>> baz: repeat i 100 [append [] i - 1]
== [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45...
>> sort baz
== [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45...

(The following are all "new facts" in programming which become trivial consequences if one first learns the "everything begins at zero" rule.)

* The first (lowest) digit is #"0" (although, in my experience, this fact belongs in the next category for most people!)
* There are 256 possible characters, the first (lowest) of which is #"^@" = #"^(00)" = to-char 0
* The first IP address in any sub-net is zero (to some number of bits).

I also think it insults the intelligence of the civilian to suggest that (s)he cannot understand zero as a starting point, given that (s)he already uses that knowledge in many everyday contexts:

* The 24 hours of the day (on the increasingly-common 24-hour clock) are numbered 0 thru 23.
* The 60 minutes in an hour are numbered 0 thru 59.
* The 60 seconds in a minute are numbered 0 thru 59.
  (Incidentally, I've used the "labeling" of components of time as a teaching illustration many times over the years. It is usually all the instruction needed.)
* A baby's age during its first year of life is zero years old. Its age during its second year of life is one year old. etc. ad mortem
* Many common summaries and displays of information (even in such LCD publications as "USA Today") begin with zero. A graph of "number of children in household" resembling

       0: =============
       1: =================
       2: ============
       3: =======
       4: ====
       5: ==
      >5:

  would confuse no-one by the fact that the first category was labeled with zero.

There are many programming tasks that involve traversing a flat data structure (e.g. a block). If positions in the structure are labeled beginning at the origin with zero, many such tasks are simplified.

* One can cycle through the positions using simple modular (clock) arithmetic:
>> pos: 3
== 3
>> len: 5
== 5
>> pos: pos + 1 // len
== 4
>> pos: pos + 1 // len
== 0
>> pos: pos + 1 // len
== 1
>> pos: pos + 1 // len
== 2
>> pos: pos + 1 // len
== 3
>> pos: pos + 1 // len
== 4
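For readers who want to try the same clock arithmetic outside REBOL, here is a minimal sketch in Python (the choice of Python and the names `next_pos` and `cycle_from` are assumptions of this sketch, not part of the original message). Python follows the conventional operator precedence, so the parentheses around `pos + 1` are required, whereas REBOL evaluates `pos + 1 // len` strictly left to right.

```python
def next_pos(pos, length):
    """Advance a 0-origin position by one, wrapping to 0 after the last slot."""
    # Python gives % higher precedence than +, so the parentheses matter here.
    return (pos + 1) % length


def cycle_from(pos, length, steps):
    """Return the positions visited in `steps` advances starting from `pos`."""
    visited = []
    for _ in range(steps):
        pos = next_pos(pos, length)
        visited.append(pos)
    return visited


if __name__ == "__main__":
    # Same starting point as the REBOL session: pos = 3, len = 5.
    print(cycle_from(3, 5, 6))  # [4, 0, 1, 2, 3, 4]
```

The only state carried between steps is the position itself, so the cycle can be resumed from any position.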
Note that this "cycling" can pick up and leave off at any point, and need not be tied to a particular "control function" with any origin assumptions.

* All calculations over positions become simplified as "offsets" and "indexes" now share a common origin and are freely interoperable.
* Remember the "number of children in household" histogram above? Calculating summaries from a data structure containing such counts has been a common task throughout my career. With 0-origin indexing, the code can use the raw counts directly as the indexes into the histogram data, while with 1-origin indexing, each such operation is complicated with the mandatory "+ 1".
* This is typical of MANY situations where one must constantly deal with "+ 1" or "- 1", with the consequences that:
  * There are more opportunities for bugs (off-by-one errors are notoriously common and often hard to track down)
  * The code looks uglier because of the insertions
  * The code is slower because of the extra arithmetic

------------------------------------------------------------
Programming languages: compact, powerful, simple ...
Pick any two!
joel'dot'neely'at'fedex'dot'com
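The histogram point in the message above can be illustrated with a small sketch in Python (an assumption of this sketch; the message itself uses only REBOL). The function names and the sample data are hypothetical. The first function uses each raw count directly as an index. The second imitates 1-origin storage by leaving slot 0 unused, so every access carries the "+ 1" adjustment.

```python
def tally_zero_origin(child_counts, max_children):
    """Tally households by number of children; slot n holds the tally for n children."""
    histogram = [0] * (max_children + 1)
    for n in child_counts:
        histogram[n] += 1  # the raw count is the index
    return histogram


def tally_one_origin(child_counts, max_children):
    """Same tally, imitating 1-origin storage: slot n + 1 holds the tally for n children."""
    # Slot 0 is left unused to imitate a container whose first position is labeled 1.
    histogram = [None] + [0] * (max_children + 1)
    for n in child_counts:
        histogram[n + 1] += 1  # every access needs the "+ 1" adjustment
    return histogram[1:]


if __name__ == "__main__":
    # Hypothetical survey data: number of children in each of seven households.
    sample = [0, 2, 1, 0, 3, 1, 1]
    print(tally_zero_origin(sample, 5))  # [2, 3, 1, 1, 0, 0]
    print(tally_one_origin(sample, 5))   # [2, 3, 1, 1, 0, 0]
```

Both functions produce the same tallies. They differ only in where the index adjustment has to be written, and forgetting it in one place is the kind of off-by-one error the message warns about.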