[REBOL] Re: Perl is to stupid to understand this 1 liner.
From: joel:neely:fedex at: 18-Dec-2001 1:37
Hi, Carl,
[Metaphysicomputational rambling ahead; proceed at your own risk!]
Carl Read wrote:
> On 18-Dec-01, Joel Neely wrote:
>
> > Whether a human manufactures the rules, or a piece of AI
> > software attempts to do so (and I suspect the human will
> > do a better job at this point in history), the problem
> > remains that the size of the rule set itself undergoes a
> > combinatorial explosion as we try to take into account the
> > variations in the data.
>
> Perhaps, instead of trying to make software understand documents
> written any old which way by humans, we should create strictly
> formal versions of current human languages that can be tested for
> correctness by computer? We'd then be able to have documents that
> could be examined by computer without the need to worry about an
> infinite number of special cases.
>
That's been tried before. It was called COBOL. ;-)
More recently, it's been tried again, and called MS Word. =8-0
Seriously, I think the proposal breaks down for two main reasons:
1) It merely displaces the issue -- whatever tool(s) enforce your
"formal versions" and "correctness" rules would *STILL* need
the rules to be defined, and users would *STILL* be annoyed
with poor performance, false positives, and false negatives.
2) It assumes that we know in advance which rules are to be
enforced based on possible future uses of the text being
created/vetted.
To elaborate (optional reading ;-)...
1) Word tries to vet spelling, grammar, punctuation, and typography
"on the fly" as the user is typing. This process is
a) incredibly annoying/distracting, especially when writing
a draft of a document I know I will subsequently revise
and tidy up, but am trying to get "on paper" quickly;
b) merely a displacement of the recognition problem to MS, because
code/rules *still* must be designed to determine when
something resembles e.g., a date or phone number closely
enough that it can be put in "standard" format or challenged
with a "Did you really mean...?" message (see point (1.a)
above!); and this is even more annoying/frustrating when
the rules are wrong, incomplete, or inadequate; and
c) frightening, as I don't want a commercial entity taking
control of my language, regardless of their own agenda.
See
http://slashdot.org/article.pl?sid=01/10/26/1334257&mode=thread
for a story titled "Microsoft Edits English" that begins
"An article in the 23-Oct-2000 issue of the New York Times
... talks about how Microsoft has eliminated words from its
thesaurus so as to "not suggest words that may have offensive
uses or provide offensive definitions for any words". Entering
a word like "idiot" yields no hits in Word 2000 unlike the
numerous hits in Word 97."
d) problematic due to international/cultural variation; consider
the controversy over conversion of all European currencies to
the Euro, and the decades-old-and-still-barely-begun efforts
to get the US public to use the metric system. It is quite
clear to me that the *ONLY* reasonable way to write dates is
2001/12/18, and I can't understand why you haven't already
figured that out for yourself (...I'm JOKING!!! ;-)
2) Human language is "living" and dynamic in the same sense as
other human activities.
a) We may not know in advance that we'd need to scan my memos
from last year to find all of the email addresses, dates,
phone numbers, street addresses, names of people who worked
in the department then but have transferred to other jobs,
program names and version numbers, hostnames for servers in
one of our labs...
b) New usages, abbreviations, conventions, etc. are being
created all the time, because what we have to say, and the
frequency with which we have to say it, is constantly
changing; rigid standardization stifles expressivity and
leaves us in a bland, barren, and plastic-laminated mental
landscape -- ya' want fries with that memo?
c) Humans are excellent at recognizing patterns, even in the
presence of noise, many kinds of errors, and considerable
variation (even of the never-seen-before kind). When I'm
writing to another human being, I can move quickly because
I can trust her/him to understand me even if I make a tpyo.
Finally, the discussion of how dates appear in running text is
IMHO only a basic exemplar of a much more pervasive issue: whenever
we (and especially our programs/systems) interact with human beings
in the "real world" the burden should be on *us*and*our*artifacts*
to do the adapting to their way of doing/expressing things. That's
as true in the design of physical workflow as it is in the design
of computational artifacts.
As you can tell from my email address, I work for a big company that
employs lots of people in lots of places/cultures to do lots of work
that must happen with high speed and low (preferably zero ;-) error
rate. However, it is often the case that what works well in some
contexts (either physically or computationally) is suboptimal in
other settings.
Finding the balance -- and perhaps I should really say "keeping the
balance, in a constantly changing world" -- between standards
enforcement and flexibility for local/personal preferences/needs
is the underlying "fractal" challenge of which date formatting is
just the tiniest tip of the iceberg.
OBTW, let's not forget humor. See the random sig of the moment...
-jn-
--
Outside of a dog, a book is man's best friend. Inside of a dog, it's
too dark to read.
-- Groucho Marx
FIX?PUNCTUATION?joel?dot?neely?at?fedex?dot?com