Mailing List Archive: The Virtues of Zero: Was "Correct" Behaviour? Was False = 2 ?

[REBOL] The Virtues of Zero: Was "Correct" Behaviour? Was False = 2 ?

From: joel::neely::fedex::com at: 3-Jul-2001 3:12


Hello, all!

I've already said enough (and more than enough!) on this topic.
For me the issue is *not* The Language That Must Not Be Named,
but rather simplicity, consistency, models, and Mathematics.
With these few, somewhat random thoughts I respectfully bring
to a close my participation in this topic.  I'm not trying to
agitate for the remaking of REBOL, nor am I trying to foment
discontent.  I am trying to make the case for the benefits of
consistency, and for the long-view tradeoff of a little learning
at first vs. a lot of busy-work forever.

-jn-

If one learns the simple rule "everything begins at zero" then
much of computing (and Mathematics) simply falls into place.

Peano's postulates for the natural numbers (aka "the counting
numbers" in elementary school) are essentially:

    -  zero is a natural number
    -  the successor of a natural number is a natural number

upon which the whole of Mathematics is based.

Edsger Dijkstra (who, IIRC, has no love for TLTMNBN) has been
quoted as saying,

   "Zero is the most natural number."

based on the above.  (In fact, his custom for years has been
to number the pages of his writings beginning with zero.  His
explanation is that each page is labeled with the number of
pages that preceded it.)

We call the point labeled zero on the number line "the origin",
because that's where everything starts.  It seems entirely
reasonable to me to label the starting point of a block with
zero, as that is the origin of the block.

Labeling positions in a data structure is not the same as
counting them.  House numbers (especially within a single
block!) are not consecutive, but this does not surprise anyone
I've ever met.  Since labels can start anywhere, and can
increase in arbitrary quanta, the simplest labeling scheme is
to begin at zero and increase by one.

WRT counting, we always implicitly start with zero.  If my
wife asks me to fetch a dozen apples from the store, the bag
contains zero apples when I open it.  As I add each apple,
the count becomes the successor of the previous count.  Shades
of Peano!

I am totally unconvinced by any appeal to the expectations of
non-programmers for many reasons, including the fact that by
definition their intuitions are untrained (or I could say
trained for other tasks).  Their intuitions have not yet been
trained with MANY useful bits of computing knowledge which
may strike them as unexpected at first.

*  REBOL ignores the common conventions of high-school algebra
   with respect to expressions such as

       4 + 5 * 2

   and

       if counter + 1 = limit ...

*  The digit characters #"0" thru #"9" precede the alphabetic
   characters #"a" thru #"z".
*  Strings and integers sort differently (this one is more of
   a shock to their naive intuitions than any other fact, in
   my experience!)

   >> foo: repeat i 100 [append [] form i - 1]
   == ["0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12"
  "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24"
  "25" "26" "2...
   >> sort foo
   == ["0" "1" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
   "2" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "3"
   "30" "31" "...
   >> baz: repeat i 100 [append [] i - 1]
   == [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
   22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41
   42 43 44 45...
   >> sort baz
   == [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
   22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41
   42 43 44 45...
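
   A short console sketch of the first two points, assuming
   REBOL/Core 2.x (the values are chosen only for illustration).
   Evaluation runs strictly left to right unless parentheses
   intervene:

       >> 4 + 5 * 2
       == 18
       >> 4 + (5 * 2)
       == 14
       >> #"9" < #"a"
       == true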

  (The following are all "new facts" in programming which become
   trivial consequences if one first learns the "everything
   begins at zero" rule.)

*  The first (lowest) digit is #"0" (although, in my experience,
   this fact belongs in the next category for most people!)
*  There are 256 possible characters, the first (lowest) of
   which is #"^@" = #"^(00)" = to-char 0
*  The first IP address in any sub-net is the one whose host
   part is zero (its low-order bits are all zero).
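
For instance, because the digits begin at #"0", the value of a
digit character is just its distance from that origin.  A small
console sketch (the digit #"7" is an arbitrary choice):

    >> to-integer #"0"
    == 48
    >> (to-integer #"7") - to-integer #"0"
    == 7
    >> to-char 0
    == #"^@"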

I also think it insults the intelligence of the civilian to
suggest that (s)he cannot understand zero as a starting point,
given that (s)he already uses that knowledge in many everyday
contexts:

*  The 24 hours of the day (on the increasingly-common 24-hour
   clock) are numbered 0 thru 23.
*  The 60 minutes in an hour are numbered 0 thru 59.
*  The 60 seconds in a minute are numbered 0 thru 59.

  (Incidentally, I've used the "labeling" of components of time
   as a teaching illustration many times over the years.  It is
   usually all the instruction needed.)

*  A baby is zero years old during its first year of life,
   one year old during its second year of life, etc. ad mortem.
*  Many common summaries and displays of information (even in
   such LCD publications as "USA Today") begin with zero.  A
   graph of "number of children in household" resembling

       0:  =============
       1:  =================
       2:  ============
       3:  =======
       4:  ====
       5:  ==
      >5:

   would confuse no-one by the fact that the first category
   was labeled with zero.
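
The clock labels above are exactly what division and remainder
produce from a running count of seconds, with no adjustment.  A
minimal sketch (the word SECS and its value are made up for the
example):

    >> secs: 86399
    == 86399
    >> to-integer secs / 3600
    == 23
    >> (to-integer secs / 60) // 60
    == 59
    >> secs // 60
    == 59

The last second of the day comes out labeled 23:59:59, and a
count of zero seconds comes out as 0:00:00.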

There are many programming tasks that involve traversing a flat
data structure (e.g. a block).  If positions in the structure
are labeled beginning at the origin with zero, many such tasks
are simplified.

*  One can cycle through the positions using simple modular
   (clock) arithmetic:

       >> pos: 3                 == 3
       >> len: 5                 == 5
       >> pos: pos + 1 // len    == 4
       >> pos: pos + 1 // len    == 0
       >> pos: pos + 1 // len    == 1
       >> pos: pos + 1 // len    == 2
       >> pos: pos + 1 // len    == 3
       >> pos: pos + 1 // len    == 4

   Note that this "cycling" can pick up and leave off at any
   point, and need not be tied to a particular "control
   function" with any origin assumptions.
*  All calculations over positions become simplified as
   "offsets" and "indexes" now share a common origin and are
   freely interoperable.
*  Remember the "number of children in household" histogram
   above?  Calculating summaries from a data structure
   containing such counts has been a common task throughout
   my career.  With 0-origin indexing, the code can use the raw
   counts directly as the indexes into the histogram data, while
   with 1-origin indexing, each such operation is complicated
   with the mandatory "+ 1".
*  This is typical of MANY situations where one must constantly
   deal with "+ 1" or "- 1", with the consequences that:
   *  There are more opportunities for bugs (off-by-one errors
      are notoriously common and often hard to track down)
   *  The code looks uglier because of the insertions
   *  The code is slower because of the extra arithmetic
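
A rough sketch of the histogram case under REBOL's 1-origin
PICK and POKE.  The sample data and the words HIST, KIDS, and
SLOT are invented for illustration:

    hist: array/initial 7 0     ; slots for 0 thru 5, then >5
    foreach kids [0 2 1 0 3 1 1 7] [
        slot: 1 + min kids 6    ; the mandatory "+ 1"
        poke hist slot 1 + pick hist slot
    ]
    probe hist                  ; [2 3 1 1 0 0 1]

With 0-origin indexing the raw count (capped at 6) would serve
as the index itself.  The same adjustment appears when cycling
through 1-origin positions, where the increment has to move
outside the remainder:

    >> pos: 4                 == 4
    >> len: 5                 == 5
    >> pos: pos // len + 1    == 5
    >> pos: pos // len + 1    == 1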

------------------------------------------------------------
Programming languages: compact, powerful, simple ...
 Pick any two!
              joel'dot'neely'at'fedex'dot'com