[REBOL] Re: Coffee break problem anyone?
From: SunandaDH:aol at: 7-Nov-2003 10:28
Hi Joel,
Thanks for some fascinating discussion points. I've only got time to reply to
a couple.
I think we'd pretty much agree that a problem solution depends on the
structure of the problem. But we might disagree in specific cases about the best
approach to take.
> I'm not clear on what you mean by "both approaches"... I view JSP
> (Jackson Structured Programming) simply as a specific type of
> structured programming which has strong heuristics about which
> structure(s) should drive the design.
Sorry. I was being vague. I was distinguishing a process-oriented approach
(what do I need to do?) from a data-oriented approach (what data structures do
I need to work on?)
Your point is that it is usually best to start with the data structures. I'd
agree. But it isn't always that simple. And starting with processes can (and
probably does) lead to quite characteristically different solutions. Which is
best is often subjective -- or "proved" by the passage of time and the
ravages that maintenance makes to the original solution.
A sort of analogous example.....
I went to London a couple of weeks ago. I could describe that trip in many
different ways. Here are three:
1. In terms of processes (I used a folding bicycle and a train).
2. In terms of waypoints ("data"): Home --> Station --> Station -->
Destination
3. In terms of purpose (to discuss an organization's global internet
strategy).
I could have used the same processes if some of the intermediate waypoints
had been different (Home --> Station can be via a canal towpath, a cycle track,
or duking it out with the traffic; there is more than one train route to
London).
But I would have needed to use different processes (taxi, car, plane) if the
waypoints had been slightly different (via New York for example).
And lots of things would have had to change if the purpose had been different
(say I was delivering a sofa).
Back to IT design, it may be better, in some circumstances, to start with
"why am I doing this?" rather than "what am I doing it to?" or "what am I doing
it with?".
> > -- The numbers are RGB values from a JPG. Find the average color tone
> > of the object the female nearest the camera is looking at. Break out
> > the neural networks for this one!
> Now we've jumped from a programming problem to a specification problem!
> The other tasks were well-defined (there could be no argument about
> whether an answer is correct or not), but not so with this last one.
> (I recently saw a documentary on TV in which a critical question was
> whether a certain figure in a Renaissance painting was male or female.
> The "experts" couldn't agree. ;-)
I'm not so sure I completely agree. Two reasons.
First, taking a purely procedural approach, I can start a solution to this
problem:
    females-in-image: copy []
    foreach item my-image [
        item-type: analyze-item item
        if item-type/female [append females-in-image item]
    ]
This highlights both the strength and the weakness of a top-down procedural
approach.....
I can easily map out the code at the highest levels. But I need faith that I
will be able to fill out the lower levels. Today, analyze-item isn't going to be
easy. In a few years' time, such functions might come free in cornflakes
packets.
My second reason is that it is *always* a matter of interpretation, even with
the original "well-defined" task.
The actual original task was to find the best way to describe the differences
between two versions of the same file. There is a lot of subjectivity there.
Examples:
original-file: ["A" "B" "C"]
new-file: ["B" "C" "A"]
Topographically, of course, these are identical. But that isn't the answer
that is going to get the program signed off.
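To make "identical" concrete: a minimal sketch, assuming the comparison simply ignores ordering (the word same-contents is just an illustrative name, not part of the original program):

```rebol
original-file: ["A" "B" "C"]
new-file: ["B" "C" "A"]

; Sort copies so the two blocks themselves keep their order
same-contents: equal? sort copy original-file sort copy new-file
print same-contents
```

A check like this says the two files hold the same lines, but nothing about how one was rearranged into the other.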
The "simplest" description of the differences, in terms of how to transform
one into the other, is:
-- Delete ["A" "B" "C"]
-- Insert ["B" "C" "A"]
That's not going to win any prizes either, although it would be "best" if the
files had been:
original-file: ["A" "B" "C"]
new-file: ["X" "Y" "Z"]
An acceptable solution might be:
-- Insert ["B" "C"] at start
-- Delete ["B" "C"] at end
A "better" solution might be:
-- Delete ["A"] at start
-- Insert ["A"] at end
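Both descriptions can be replayed against a copy of the original block to confirm that each really does produce the new file. This is only a sketch using standard series functions, not the code from the actual difference program; the word work is an illustrative name.

```rebol
original-file: ["A" "B" "C"]
new-file: ["B" "C" "A"]

; "Acceptable": insert ["B" "C"] at start, delete ["B" "C"] at end
work: copy original-file
insert work ["B" "C"]        ; splices both items in at the head
clear skip tail work -2      ; drops the last two items
print equal? work new-file

; "Better": delete ["A"] at start, insert ["A"] at end
work: copy original-file
remove work                  ; removes the first item
append work "A"
print equal? work new-file
```

Both scripts reach the same result; the second simply touches one item instead of two, which is one possible measure of "better".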
But "best" and "better" depend on the resources available. In this case, they
are a little restricted:
** REBOL has a limited recursion stack (2000-odd items) so an off-the-shelf
solution using recursion is out of bounds.
** Even unravelling the algorithms to remove the recursion doesn't help --
take a look at GNU's Diff.c for example. It's full of dire warnings about the
algorithm being "too expensive" in some conditions. We wanted something that was
reasonably deterministic (timing scaling according to the size of files, not
the number of differences).
That led to a series of hacks (I'm not proud of the design) that take a
series of linear passes at the two files with a few sorts. It works, I'm happy.
But we can still argue over the "best" way to describe some changes. Is:
original-file: ["A" "A" "A"]
new-file: ["A" "A" "A" "A"]
an insertion at the beginning? At the end? An intermediate point? Or some
combination of moves, deletions and insertions?
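The ambiguity is easy to demonstrate. In this sketch (illustrative only, with pos and work as made-up names), a single "A" is inserted at each of the four possible positions in turn and the result is compared with the new file:

```rebol
original-file: ["A" "A" "A"]
new-file: ["A" "A" "A" "A"]

; Try the insertion at each of the four possible positions
repeat pos 4 [
    work: copy original-file
    insert at work pos "A"   ; position 4 is the tail, so this appends
    print [pos equal? work new-file]
]
```

Every position gives a block equal to the new file, so the data alone cannot say which description is the "right" one.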
So the original, apparently "well-defined", task emerged from an interplay of
the algorithm design (which was constrained by current limitations in REBOL),
a subjective decision (to use REBOL rather than Python or C), and
a subjective debate on the "best" way to choose between alternative
presentations of the results.
The experts finally had to agree in this case, but we're still mumbling over
some of the fine details. Even the most seemingly objective task design has
hidden subjective depths.
Thanks again for the stimulation. Now I really do need a coffee break,
Sunanda.