Mailing List Archive: Re: RFC: Cross-language benchmark proposal

[REBOL] Re: RFC: Cross-language benchmark proposal

From: joel:neely:fedex at: 9-Nov-2002 8:09


Hi, Carl, and all,

I think we're making this WAAAAAY harder than it needs to be!
(And I'll accept my full share of responsbility...  ;-)

Carl Read wrote:
> On 09-Nov-02, Ed Dana wrote:
>
> > The algorithm for any program should be specified in simple
> > English.  It should be void of any technical references nor
> > should it suggest any technical solutions.
>

The issue which concerns me (and which I accept responsibility
for not communicating clearly) is that we're trying to compare
*language*implementations* and not *programmers*.  Therefore
we have to be clear about how much latitude is allowed to the
author of a test solution, and only compare solutions that are
fairly comparable.

Suppose we had a task specified as follows:

  - Generate a list/array of 10000 random integers, each
    between 1 and 1000000;
  - Sort the array into ascending order;
  - Sort the result into descending order.

Now suppose we get the following submissions:

  Program   Programmer   Algorithm      Language
  -------   ----------   -------------  --------
     A        Alice      bubblesort      REBOL
     B        Bob        quicksort       REBOL
     C        Clara      built-in SORT   REBOL

     D        Don        bubblesort      Perl
     E        Ethyl      build-in SORT   Perl

We could compare A and D to get an idea of how the languages
compare in their handling of looping and array/block access
speeds.

We could compare C and E to get an idea of how the languages
compare in their fastest possible sorting mechanisms (I'm
assuming that the built-in SORT facilities are faster than
someone could rewrite in interpreted code).

But comparing B with *any* other submission doesn't IMHO give
us any meaningful compare-REBOL-with-Perl information (although
it *does* tell us that Bob wrote a more sophisticated algorithm
than Alice, and it might tell us just how much gain a REBOL
programmer could realize by using SORT instead of a home-grown
function).

Given those five submissions, I'd suggest that we probably ought
to split that hypothetical problem above into two tasks:

 1) Solve the sorting problem above, using bubblesort according
    to the following pseudo-code specification ...; and

 2) Solve the sorting problem above, using the fastest sorting
    means you can come up with.

Then I'd view A and D as solutions to (1); C and E would make
good solutions to (2); I don't think it would be reasonable to
compare e.g. A with E (or B with D) and think I was thereby
making a fair compare/contrast examination of REBOL and Perl.

> > For example, the algorithm for the RandomCopy procedure I
> > mentioned earlier is:
> > * Specify Source Directory
> > * Specify Target Directory or File
>
> > 1 Get all the files in the Source Directory.
> > 2 Choose one file  randomly.
> > 3 Copy it to the Target File or Directory.
>
> > Stating it this way keeps it at a "logical" level and should
> > avoid any "subjective" issues as the only constraints here
> > are the resources being utilized. I.e. Files and directories.
>
> Okay.  And I'd write that something like this, (though with a filled
> header of course), ...
>
> rebol []
> dir: %test-dir/
> file: %test-dir/random-file
> write/binary file read/binary join dir random/only read dir
>

That seems quite reasonable for the problem as stated.  Now, let's
suppose that some novice C programmer wrote a submission that was
structured as follows:

    1)  Recursively traverse the entire file system from the root
        directory, performing the following for each fully-pathed
        file name encountered:

    1.1)   If the encountered name is a complete prefix match to
           the specified source directory, and there are no more
           slashes in the remainder of the encountered name,
           allocate a buffer to hold the name, copy the name into
           the buffer, and store a pointer to the buffer into the
           array of file names.  Oh, by the way, if that filename
           array is full, allocate a new one twice the size, copy
           all existing pointers into the new one, and then append
           the new name pointer.

    2)  After traversing the entire file system, shuffle the array
        of file name pointers, using a bubble sort algorithm with
        a comparison function that returns random results.

    3)  Open the file whose name is in the middle of the shuffled
        array.

    ...

I've already gone on too long, but pretend that the rest of the
program is as BBD as the above.  Now, what would we learn about
C and REBOL by comparing that program with your three-liner?

(For extra credit, what would we learn about the kind of twisted
mind that could come up with such an obviously wrong-headed
algorighm?  ;-)

-jn-

--
; Joel Neely                             joeldotneelyatfedexdotcom
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t: "" foreach [a b] s [repend t [b a]] t ] {
| e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]