Mailing List Archive: Re: make-doc-pro: how to handle tables?

[REBOL] Re: make-doc-pro: how to handle tables?

From: joel:neely:fedex at: 23-Sep-2001 22:53


Hi, Gregg,

Gregg Irwin wrote:
> <<
>     Employee ID First Name Last Name   Phone Nr
>     -------- -- ----- ---- ---- ------ ----- --
>     1234        John       Doe         555-1212
>     2345        Jane       Doaks       555-1213
>     3456        Ferdinando Quattlebaum 555-1214
>     4567        Mary Ellen Van Der Lin 555-1215
> >>
>
> This is a great example! I would consider this to be a bad
> table layout because it *is* confusing.
>

But it isn't confusing, is it?  Since you're an intelligent
human being, you understand enough about the *meaning* of the
data in the columns (and about interpreting the rows of the
table in the context of the entire table) that you could
probably see four columns instantly.

> Why would you build such a tightly spaced table (outside of
> being an example here)?
>

This was only an illustration.  I might need to use tight spacing
because there's lots of data to get on each line and I'm trying
to stay within narrow text margins.

The rest of my answer would be very similar to your next comment.

> If I use tabs, I get a wider spacing than I would like for
> this table, but it clearly delineates the columns.
>

The more spaces I have to type, the more typing effort is
involved.  In addition, whether I use spaces or tabs, if I have
lots of data to present, I want to waste as little space as
possible.  So, I agree with your comment that the example
below is "a wider spacing than I would like", or than will fit
on the page.

> Employee ID             First Name              Last Name               Phone Nr
> -------- --             ----- ----              ---- ------             ----- --
> 1234                    John                    Doe                     555-1212
> 2345                    Jane                    Doaks                   555-1213
> 3456                    Ferdinando              Quattlebaum             555-1214
> 4567                    Mary Ellen              Van Der Lin             555-1215
>
> << OBTW, tabs are of absolutely no use in solving this problem...
>
> Right! Yes! Absolutely! So, naturally, you would type *another*
> tab to add space before the start of the next column. Type up
> a table of text in your favorite word processor. How do you do
> it?
>

If I were using a word processor (or a desktop publishing program),
I would type exactly one tab between successive columns.  I would
then adjust the positions of the tab stops to provide adequate
horizontal separation between the columns.  Unfortunately, simple
text processors (and most utilities that just dump plain text files
to screen or printer) have fixed tab stops (e.g., every 8 characters)
which usually means that some tab positions are too wide and others
are too narrow (when we're dealing with typical text/data).
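
To make the fixed-tab-stop problem concrete, here is a minimal
sketch (in Python rather than REBOL, purely for illustration).  It
assumes a viewer with tab stops every 8 characters and uses rows
from the example table, typed with exactly one tab between columns.

```python
# Rows typed with exactly one tab between columns.
rows = [
    "Employee ID\tFirst Name\tLast Name\tPhone Nr",
    "1234\tJohn\tDoe\t555-1212",
    "3456\tFerdinando\tQuattlebaum\t555-1214",
]

# Render with fixed tab stops every 8 characters, as a plain-text
# viewer or printer utility typically would.
rendered = [row.expandtabs(8) for row in rows]
for line in rendered:
    print(line)

# Character position at which the second field starts on each line.
starts = [line.index(row.split("\t")[1]) for line, row in zip(rendered, rows)]
print(starts)
```

The header's second field starts at a different position than the
data rows' second fields, so one tab per column does not line the
columns up unless the tab stops themselves can be moved.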

> <<
> > Whitespace *is* visual feedback.
> >
> Bah!  Humbug!  Horsefeathers!  See the proofreading exercise
> above.  Counting nonprinting characters to figure out *what*
> *kind* of delimiter (between words *in* a column or between
> columns) is an awful burden to lay on a poor human (technical
> or not!).
> >>
>
> Right again! We're not counting characters at all, we never do.
>

Distinguishing one space from multiple spaces requires counting,
even if it's only counting from 1 to 2.  But this raises the
question of what "multiple" means!
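
As a rough illustration (Python, not REBOL), suppose a tool adopted
the hypothetical rule "two or more spaces separate columns".  The
function name and the threshold are my assumptions, not anything
make-doc-pro does.  Applied to rows of the tightly spaced table:

```python
import re

tight_rows = [
    "1234        John       Doe         555-1212",
    "3456        Ferdinando Quattlebaum 555-1214",
    "4567        Mary Ellen Van Der Lin 555-1215",
]

def split_on_runs(line, min_spaces=2):
    """Split a line wherever at least `min_spaces` spaces occur in a row."""
    return re.split(r" {%d,}" % min_spaces, line.strip())

# Number of "columns" the rule finds in each row.
cell_counts = [len(split_on_runs(row)) for row in tight_rows]
print(cell_counts)
```

The rule finds four cells in the first row but only two in the
others, because long values leave a single space between columns.
A human sees four columns in every row; the counting rule does not.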

> What we're doing is creating "visual groupings" of data in
> columns. They must be visually distinguishable for us to make
> sense of them. If they are laid out so that they might confuse
> a human reader, then I would expect a parser to get confused
> as well. Now, a human can take a few extra seconds to try and
> make sense of the data, as I did with your first example table.
> Can a program do the same thing? Sure. As you mention, this is
> no easy task, and beyond what make-doc-pro should be expected to
> do, but a program should be able to scan the data and take a
> few extra seconds to try and determine some heuristics that make
> the most sense of things. Worst case, it could ask. "Hey, are
> there 4 columns in this table (as in the example I'm displaying
> to you now in my disruptive dialog box)?"
>

What we *intend* to do is create visual groupings.  What we
actually do is type characters.  What we must do if we want a
program to read our minds and discover our intentions is to
make precise statements about the rules of inference for
deciding where the "visual groupings" are to be delimited.

We certainly agree that writing such telepathic code in the
absence of clear specifications "is no easy task, and beyond
what make-doc-pro should be expected to do".  However, this
thread began with the question of how to incorporate tables
into make-doc-pro or similar tools.  So we seem to be agreeing
that writing heuristic code that can make sense of our use of
whitespace is beyond the scope of a make-doc-style program.

> <<
> > Do we need delimiters, beyond whitespace, in our REBOL code?
> >
> We certainly do.  Including  [  ]  (  )  {  }  and  " .
> >>
>
> Sorry, my meaning wasn't clear. Here's an example of what I meant.
> Using only whitespace, we can type this:
>
>         emit: func [data] [append out reduce data]
>
> Using another delimiter (|), we get this:
>
>         emit:|func|[data]|[append|out|reduce|data]
>
> That's the simple case I wanted to make.
>

But it illustrates the fallacy I was emphasizing.  In the above
example, the whitespace is ONLY used for a single "level" of
logical grouping.  All other structure is shown by means of the
other delimiters.  Using your example, we can agree that

    emit: func [data] [append out reduce data]

and

    emit: func [
        data
    ][
        append  out  reduce data
    ]

mean exactly the same thing, variations in whitespace being of
no significance.

As far as I can tell, a table has four levels (not counting the
level of individual characters!):

1)  individual tokens/words/symbols within a cell
2)  a single cell, made up of such tokens/words/symbols
3)  a single row, made up of cells
4)  the entire table, made up of rows

Using varying amounts of whitespace (horizontal or vertical) to
distinguish among these levels seems to risk the same possible
errors (of typing or reading) as if we used varying whitespace
in the code example

    emit   func   data   append out reduce data

where appearing at the left margin makes a word a set-word, a
gap of multiple spaces begins or ends a block, and single spaces
separate words within a block.  I hope none of us would argue
in favor of such a "natural, punctuation-free" version of REBOL!
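
For comparison, here is a sketch (Python, for illustration only) of
how the four levels listed above fall out when each level has its
own delimiter.  It assumes one row per line, "|" between cells, and
ordinary whitespace between tokens; parse_table is a made-up name.

```python
def parse_table(text, cell_sep="|"):
    """Return the table as nested lists: rows -> cells -> tokens."""
    table = []
    for line in text.splitlines():        # table -> rows (newline)
        if not line.strip():
            continue                      # ignore blank lines
        cells = line.split(cell_sep)      # row -> cells ("|")
        # cell -> tokens (whitespace of any width)
        table.append([cell.split() for cell in cells])
    return table

text = """\
Employee ID|First Name|Last Name|Phone Nr
1234|John|Doe|555-1212
4567|Mary Ellen|Van Der Lin|555-1215
"""
table = parse_table(text)
print(table[2])
```

Whitespace carries only one level here (tokens within a cell), so
the amount of it never matters, just as in the REBOL example.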

> <<
> In my book things such as
>
>   \table
>   /table
>
> or underlining with hyphens or equal signs are just as much
> "syntax" that has to be learned and understood for effective
> use as is
>
>     ~|Employee ID|First Name|Last Name  |Phone Nr
>     ~|-------- --|----- ----|---- ------|----- --
>     ~|1234       |Johannes  |Doe        |555-1212
>     ~|2345       |Betty  Sue|Doaks      |555-1213
>     ~|3456       |Ferdinando|Quattlebaum|555-1214
>     ~|4567       |Sue  Ellen|Van Der Lin|555-1215
> >>
>
> I disagree...somewhat. Some of this has to be couched in the
> context of the make-doc discussion. If I were just typing a
> document, I would probably do something like this:
>
>                         Table 1.
>         ---------------------------------------
>         NAME            PHONE           E-MAIL
>         ---------------------------------------
>         Bob Jones       000.0000        [Bob--Jones--com]
>         Mary Smith      111.1111        [Mary--Smith--com]
>
> Can make-doc make sense of this? Let's see...blank line, then
> a line is indented and starts with the word "table", then a
> line of dashes, words in all caps (make note of what character
> column they fall on), ...
>

Now you're just making up another set of rules on the fly,
with yet more opportunities for a poor human to make an
error (typo, failing to remember a rule, etc.)  What happens
if the typist doesn't remember to put NAME, PHONE, and
E-MAIL in all caps?  Rules are still rules, regardless of
which keys on the keyboard they require!

> I believe the standard is just to leave the cell empty or to
> put an em dash in the cell. In our case, the latter seems the
> obvious choice for ease of implementation.
>

Leaving it empty creates ambiguous runs of whitespace unless
there is a column delimiter.  As for the alternative, my
ASCII keyboard doesn't have an "em-dash" key.  Requiring
something (doubled hyphens, for example) to represent an
empty cell is certainly plausible, but again constitutes yet
another rule whose violation might have consequences.  More
rules again.
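
A small sketch of the empty-cell ambiguity (Python, illustration
only).  The row is Jane Doaks with the first name left out; the
"two or more spaces" rule is again a hypothetical convention.

```python
import re

# Same row, first-name cell left empty, in both styles.
delimited = "2345||Doaks|555-1213"
spaced = "2345                   Doaks       555-1213"

delimited_cells = delimited.split("|")
spaced_cells = re.split(r" {2,}", spaced)

print(delimited_cells)
print(spaced_cells)
```

With the delimiter, the empty cell survives as an empty string in
the second position.  With whitespace alone, the split yields three
cells and no indication of which column is missing, unless the
reader goes back to counting character positions.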

> << IOW, simplicity is achieved by controlling the rate at
> which features are discussed with the user, not by making
> a limiting choice that actually causes subtle problems
> (like remembering to count spaces). >>
>
> This logic is somewhat counterfeit...
>

You're entitled to your opinion, but I've actually deployed
a system using this kind of approach and have had no problems
with it (nor its users) to date.

> <<
> Now, since you want to take up keystroke counting next, answer
> the same question with this version.
>
>     ~|Widget A|Widget B|Widget C
>     ~|50%|10%|3%
>     ~|500,000|100,000|3,000
>
> How many keystrokes did we save by not fooling ourselves into
> believing that we had to make the columns line up?  By using
> an explicit delimiter, the typist gets to choose whether to
> do so or not, and to choose to save the maximum number of
> keystrokes if she/he wishes to do so.
> >>
>
> Another valid case. Now, my *goal* is not to save keystrokes,
> that just comes along as a happy side-effect. I originally did
> some delimited layouts to post, showing the very issue you
> mention:
>
> a|b|c|d|e|f|g
> 12|23|34|45|56|67|78
> acb|def|ghi|jkl|mno|pqr|stu
> $12.00|$24.50|$37.75|$1.00|$0.50|$0.25|$0.01
>
> I didn't include them in my original post because this is so
> horrific, from a layout perspective, that I can't imagine
> anyone even considering it as an option and I didn't think it
> was fair to hit you with it. :)
>

Since we're confessing... ;-)  I had originally planned on
including the following related issue (but had written too
much already).

Those who have data in spreadsheet or database format can
easily export it to delimited text, then cut and paste the
data into their document.  Why should they have to do all of
the manual effort to replace the explicit delimiters with
varying amounts of whitespace just to make the columns line
up?  That's what we employ computers for!  They can count
columns and spit out "tidy text", HTML, RTF, PDF, TeX, or
whatever, and save us the tedium.

> For me, and I could be alone in this, creating delimited tables
> using ASCII chars as we are discussing, always has me playing
> the back and forth game to get my delimiters to line up, then
> that moves my data, so I have to readjust that, etc. Could just
> be that I haven't been forced to come up with a clever solution
> to my problem since I can avoid it.
>

The solution is to have the program include a "tidy text"
option to deal with all of that time-wasting character
shuffling.  Then you can

1)  Type (or cut-and-paste) minimal text with minimal effort.
2)  Convert to tidy-text for those (few?) cases where text
    is really the right medium.
3)  Convert to HTML (etc.) for the other cases.

... and in all cases, let the computer do the tedious work
of shuffling characters and counting columns.
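
Here is a minimal sketch of what such an option might look like
(in Python for illustration; the function names are hypothetical
and this is not how make-doc-pro works).  It assumes the "~|" row
marker and "|" cell separator from the Widget example quoted above.

```python
from html import escape

def parse_rows(text, mark="~|", sep="|"):
    """Split each non-blank line into stripped cells, dropping the row mark."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(mark):
            line = line[len(mark):]
        rows.append([cell.strip() for cell in line.split(sep)])
    return rows

def tidy_text(rows, gap=2):
    """Pad every cell to its column's widest entry so the columns line up."""
    ncols = max(len(row) for row in rows)
    widths = [max(len(row[i]) for row in rows if i < len(row))
              for i in range(ncols)]
    lines = []
    for row in rows:
        padded = [cell.ljust(widths[i]) for i, cell in enumerate(row)]
        lines.append((" " * gap).join(padded).rstrip())
    return "\n".join(lines)

def to_html(rows):
    """Emit the same rows as a bare HTML table."""
    body = "\n".join(
        "<tr>" + "".join("<td>%s</td>" % escape(cell) for cell in row) + "</tr>"
        for row in rows
    )
    return "<table>\n%s\n</table>" % body

minimal = """\
~|Widget A|Widget B|Widget C
~|50%|10%|3%
~|500,000|100,000|3,000
"""
rows = parse_rows(minimal)
print(tidy_text(rows))
print(to_html(rows))
```

The typist enters the minimal form once; the aligned text and the
HTML are both derived from it, so nobody shuffles spaces by hand.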

> <<
> Literate humans (whether "users" or not)
> are completely familiar with using punctuation as a way of
> showing the structure of text.  Periods at the ends of
> sentences, commas between elements of a list, horizontal rules
> between sections of a document, etc. are all well within their
> comfort zone.
> >>
>
> Agreed. Those are conventions we are taught. Using a pipe symbol
> as a delimiter between pieces of text is not. This is where we
> diverge in our view as I consider this to be a computer/programmer's
> convention and not a normal language/grammar convention.
>

I think anyone who understands that commas separate words or
phrases in a list can learn to use a column separator in a
few seconds.  They already understand the concept of delimiter;
I'm just reusing it.  That seems to be where we differ.

> Thanks for a fabulous discussion! Whatever the result, I'm always
> energized by this kind of constructive discourse (I love that word). :)
>

Thank you!  I appreciate the exchange of ideas!

-jn-

--
; Joel Neely  [joel--neely--fedex--com]  901-263-4460  38017/HKA/9677
REBOL []  foreach [order string]  sort/skip reduce [ true "!"
false  head reverse "rekcah"  none "REBOL "  prin "Just " "another "
] 2 [prin string] print ""