Cross-language benchmarking
[1/5] from: joel::neely::fedex::com at: 5-Nov-2002 11:38
Hello, all,
Pardon my manners in not replying to individual remarks, but "so many
emails, so little time!" ;-)
Let me raise a couple of points about what we might accomplish with
such an effort, and who might be in the audience, as a way of thinking
about how such an effort might be done, and what languages might be
included.
Option 1
========
Goal: Demonstrate to the world that REBOL is a viable,
competitive language for many common programming tasks
(where "competitive" is defined in terms of run time
performance).
Audience: Developers at large.
Languages: Commonly used languages, to maximize likelihood that the
reader will be familiar with one or more of the other
"competitors": c/c++, Java, Perl, Python, VB, ...
Tasks: Toy problems that allow specific aspects of performance
to be instrumented/measured. Also some small "real"
tasks in REBOL's "sweet spot" of performance.
Comment: The tests must be fair, and must be seen to be fair.
We've all seen the kind of unvarnished advocacy that
claims things like "X is faster than Tcl, uses fewer
special characters than Perl, and is cheaper than
COBOL on an IBM mainframe, therefore it's the best!"
which only hurts the credibility of the claimant.
Option 2
========
Goal: Demonstrate to the world that REBOL is a viable notation
for use in RAD/prototyping tasks, and makes a good "tool
to think with".
Audience: Same as (1)
Languages: Same as (1)
Tasks: Reasonably small "spike" (in the XP sense) tasks that
would be recognizable as related to application-level
needs.
Comment: It's also fair to include code size and programmer effort
in such discussions, but these are notoriously difficult
to instrument objectively.
Option 3
========
Goal: Identify and document REBOL programming techniques that
have substantial effect on performance, and build a
vocabulary of "performance patterns" for future use.
Audience: REBOL programmers.
Languages: REBOL only
Tasks: Those situations where reasonably small effort in
refactoring/strategy could produce significant gains in
performance.
Option 4
========
Goal: Help identify "hot spots" in REBOL where performance
optimization would have significant perceived value.
Audience: RT
Languages: REBOL and limited number of "compare for reference"
alternative languages.
Comment: I want to be clear on this one: I'm not suggesting that
RT staff are unaware of performance issues! Not at all!
However, several on this list (including RT staff) have
observed that the good folks at RT don't get any more
hours in the day than the rest of us (and therefore have
to pick and choose where to put their mere 20 work hours
per day ;-)
If the list members can help spread the load of finding
issues worthy of attention, help (by participation) to
indicate which performance issues are considered higher
priority than others, or even find one glitch that has
escaped notice to date, then I think the effort would be
a net gain.
General Comment
===============
Benchmarking is tricky business at best, and A Dark Art at worst.
For results to be meaningful, the sample base must be large enough
(and the individual tests must be large enough) that transient or
special-case effects get averaged out (e.g., garbage collection
that happens now-and-again during series flogging, differences in
performance due to different CPU speeds, free memory, disk I/O
bandwidth, network bandwidth/congestion, concurrent processing on
computers with real operating systems, etc).
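To make the averaging-out idea concrete, here is a minimal sketch (in Python, purely illustrative and not part of any proposed benchmark suite) of repeating a measurement several times and taking the median, so that a one-off garbage collection or a busy neighbor process does not skew the result:

```python
import statistics
import timeit

def bench(stmt, runs=7, number=1_000):
    """Time 'stmt' several times and report the median sample, so that
    transient effects (GC pauses, other processes) get averaged out."""
    samples = timeit.repeat(stmt, repeat=runs, number=number)
    return statistics.median(samples)

# Illustrative workload; any small, repeatable task would do.
t = bench("sorted(range(100, 0, -1))")
```

The median (rather than the mean) is one conventional way to discount the special-case outliers described above.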
It will be of little use (except to the submitter! ;-) to have a
single benchmark comparing REBOL to Glypnir on an Illiac IV. The
strong benefit IMHO to using primarily cross-platform languages is
that it allows us to perform the tests under the widest possible
range of conditions, thus improving the quality of our averages.
That said, there's probably room for a widely-used proprietary
language (e.g., VB) since that's likely familiar to a significant
portion of the target audience for options (1) and (2). We just
need to be careful to have the widest possible set of alternatives
run on *the*same*boxen* as the proprietary cases, so that we can
make meaningful use of the results. (E.g., a single comparison
of REBOL vs C++ on a box running XPpro would be hard to interpret
beyond the obvious "which one is faster?")
-jn-
--
----------------------------------------------------------------------
Joel Neely joelDOTneelyATfedexDOTcom 901-263-4446
[2/5] from: edanaii:cox at: 5-Nov-2002 12:25
Joel Neely wrote:
[Snip]
>Option 1
>>
<<quoted lines omitted: 28>>
> in such discussions, but these are notoriously difficult
> to instrument objectively.
I personally believe that option two is the best choice. However, there
is no reason why submissions can't be categorized: speed as a criterion
where it makes sense; intuitiveness, code size, and programmer effort
where those make sense.
In terms of the site that I was contemplating, since it was meant to be
a "how to" site, option 2 fits this scenario best. Option one, IMHO, is
more likely to attract Computer Nerds. This is not a bad thing, in and
of itself, but you want professionals, trying to do their job, who would
hopefully stay and look at the competitions, if only out of professional
curiosity.
As for evaluating programmer effort, if a standard algorithm is used for
all comparisons, I think the lines of code written to implement the
algorithm is a good measure of effort. Also, since not all languages may
be able to implement all parts of the algorithm, since they do not all
approach solutions in the same way, "completeness", for lack of a better
term, would be an important standard as well...
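A rough sketch of the lines-of-code measure Ed describes (in Python; counting only non-blank, non-comment lines, and assuming a single per-language comment prefix, which is a deliberate simplification):

```python
def count_loc(source, comment_prefix="#"):
    """Rough 'lines of code': non-blank lines that aren't pure comments.
    A crude effort metric in the spirit of the discussion above; real
    comparisons would need per-language comment handling."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count

sample = """
# compute a sum
total = 0
for x in range(10):
    total += x
"""
print(count_loc(sample))   # counts the 3 code lines
```

LOC is of course only a proxy for effort, which is part of why "completeness" matters as a companion standard.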
>General Comment
>>
<<quoted lines omitted: 19>>
>of REBOL vs C++ on a box running XPpro would be hard to interpret
>beyond the obvious "which one is faster?")
Well said and well written, Joel.
As to cross-platform testing, the standard I would prefer to judge such
code by would be "completeness", similar to assessing effort. In other
words, does the same program perform as specified when tested on
differing hardware?
--
Sincerely, | We're Human Beings, with the blood of a million
Ed Dana | savage years on our hands! But we can stop it! We
Software Developer | can admit we're killers, but we're not going to
1Ghz Athlon Amiga | kill today. That's all it takes! Knowing that we're
| not going to kill... Today! -- Star Trek.
[3/5] from: carl:s:rebol at: 5-Nov-2002 16:16
Thanks Joel, for breaking it down. I like Option 1.
And, we know between us that we're not trying to convert everyone for all
uses, but offer a useful tool that we personally find to save us time in
the long run. A certain percentage of developers need just this kind of
comparison, and it's something RT has been asked for many times (although
you know that we never bash other languages). So, how can it be done?
-Carl
At 11:38 AM 11/5/02 -0600, you wrote:
[4/5] from: joel:neely:fedex at: 6-Nov-2002 7:56
Hi, Carl, and all,
OBTW, I didn't mean to imply that the options were mutually
exclusive. I believe we can satisfy multiple goals, as long
as we keep ourselves clear on what we're working toward.
See below:
Carl at REBOL wrote:
> And, we know between us that we're not trying to convert everyone
> for all uses, but offer a useful tool that we personally find to
> save us time in the long run. A certain percentage of developers
> need just this kind of comparison, and it's something that RT has
> been asked many times (although you know that we never bash other
> languages). So, how can it be done?
>
I suggest we pick a small set of languages, test with a small set
of benchmark tasks, publish the results, and let it grow over time
to include more languages and tasks as needed. I also suggest we
run the tests on multiple platforms (wxx, Unices, Linux, Mac OS X,
... ?others?) and average the normalized results to provide some
degree of platform neutrality. I suggest normalizing all run times
against c (= 1) to avoid dependence on CPU speed, etc.
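The normalize-against-c-and-average idea can be sketched like so (in Python, one of the languages under discussion; the platform names and timing figures below are invented for illustration, not measurements):

```python
import math

# Hypothetical wall-clock times (seconds) per platform and language.
timings = {
    "linux":   {"c": 1.0, "rebol": 8.0, "perl": 4.0},
    "windows": {"c": 1.2, "rebol": 9.6, "perl": 6.0},
}

def normalized_scores(timings):
    """Normalize each platform's times against c (= 1), then take the
    geometric mean of the ratios across platforms, so that differences
    in raw CPU speed drop out and no single box dominates."""
    langs = next(iter(timings.values())).keys()
    scores = {}
    for lang in langs:
        ratios = [times[lang] / times["c"] for times in timings.values()]
        scores[lang] = math.exp(sum(map(math.log, ratios)) / len(ratios))
    return scores

scores = normalized_scores(timings)   # c comes out at exactly 1.0
```

The geometric mean is the customary way to average ratios; an arithmetic mean of normalized times would overweight the platforms where the baseline happens to be slow.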
> >Goal: Demonstrate to the world that REBOL is a viable,
> > competitive language for many common programming tasks
> > (where "competitive" is defined in terms of run time
> > performance).
> >
...
> >Languages: Commonly used languages, to maximize likelihood that the
> > reader will be familiar with one or more of the other
> > "competitors": c/c++, Java, Perl, Python, VB, ...
My nominees for languages are:
Language Reason
-------- --------------------------------------------------
c It serves as a baseline for optimal speed.
Java Widespread enterprise usage.
Perl Probably the most popular platform-neutral language,
and main "competitor" for 'net-related applications
on the back end (cgi, etc.)
Python Second only to Perl ...
I personally would be interested in some of the "academic" languages
(e.g. Scheme, Haskell, Prolog), but I'm *not* including those in my
list of nominees because I think they are insufficiently "mainstream"
to be relevant to most of the audience of Option 1 who would be
looking to build and deploy personally or professionally.
-jn-
--
; Joel Neely joeldotneelyatfedexdotcom
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t: "" foreach [a b] s [repend t [b a]] t ] {
| e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]
[5/5] from: joel:neely:fedex at: 6-Nov-2002 8:13
Hi again, Carl ...
... on a slightly different tack.
Carl at REBOL wrote:
...
> >Goal (1): Demonstrate to the world that REBOL is a viable,
> > competitive language for many common programming tasks
> > (where "competitive" is defined in terms of run time
> > performance).
> >
...
> >Goal (3): Identify and document REBOL programming techniques that
> > have substantial effect on performance, and build a
> > vocabulary of "performance patterns" for future use.
...
> >Goal (4): Help identify "hot spots" in REBOL where performance
> > optimization would have significant perceived value.
Speaking for myself, the Ackermann discussion has already given me
some ROI on goal (3) with ideas about refactoring expressions to
minimize depth/nesting.
Now, WRT goal (4) ...
This is totally a shot in the dark, as I have no clue about the
internal structure of the REBOL interpreter, but here goes anyway!
Testing inspired by Ladislav's comments would indicate that stack
is being consumed by native functions (such as EITHER) with the
consequence that user-level recursion depth is diminished. Perl
has a mechanism that allows a subroutine to delegate to another
subroutine in a way that does not increase the call stack depth.
In pseudo-REBOL notation, one can replace something like
    foo: func [x y] [
        either phase-of-moon x y [
            bletch x y
        ][
            ;; otherwise, do something else
        ]
    ]

    bletch: func [x y] [
        ;; transmogrify x y and return something
    ]

with

    foo: func [x y] [
        if phase-of-moon x y [become bletch]
        ;; otherwise, do something else
    ]
(for the same BLETCH). IOW, instead of creating a new frame and
invoking BLETCH with a subsequent return to FOO and thence to FOO's
caller, the evaluation simply twiddles the state so that BLETCH is
invoked (with FOO's arguments) and BLETCH returns directly to FOO's
caller upon completion (sort of vaguely like an exec() in Unix...)
Now, I'm *NOT* suggesting that we have such a mechanism in high-
level REBOL *AT*ALL*! But I'm wondering if it would be feasible
to allow NATIVE! functions to make use of such a mechanism in
special cases, so as to minimize the stack penalty of using IF,
EITHER, etc. Specifically, could e.g. the native code for EITHER
directly proceed to the evaluation of the first or second block
alternatives without nesting a call, since we're guaranteed (I
am assuming) that the result of the nested block evaluation is
actually going to be passed back to the expression where EITHER
appeared without any further manipulation?
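The delegation mechanism described above has a well-known analogue: a trampoline. As an illustration only (in Python, since BECOME is hypothetical notation and exists in neither REBOL nor Perl under that name), a function can return a thunk describing the delegated call instead of making it, and a driver loop invokes thunks until a plain value appears, so the call stack never deepens:

```python
def countdown(n):
    # Instead of recursing (which would grow the stack), return a thunk
    # describing the next call -- the analogue of "become countdown".
    if n == 0:
        return "done"
    return lambda: countdown(n - 1)

def trampoline(result):
    """Repeatedly invoke thunks until a non-callable value appears.
    The stack depth stays constant no matter how many delegations occur."""
    while callable(result):
        result = result()
    return result

# 100,000 levels of "recursion" without touching the recursion limit:
print(trampoline(countdown(100_000)))
```

This is the user-level version of the trick; what Joel asks about is whether the interpreter could do the equivalent internally for natives like EITHER, so block evaluation proceeds without nesting a call.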
-jn-
--
; Joel Neely joeldotneelyatfedexdotcom
REBOL [] do [ do func [s] [ foreach [a b] s [prin b] ] sort/skip
do function [s] [t] [ t: "" foreach [a b] s [repend t [b a]] t ] {
| e s m!zauafBpcvekexEohthjJakwLrngohOqrlryRnsctdtiub} 2 ]
Notes
- Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted