Mailing List Archive: 49091 messages

[REBOL] Re: Not-too-smart Table Parser (was: how to handle tables?)

From: meekerdb:rain at: 23-Sep-2001 0:37

On 23-Sep-01, Gregg Irwin wrote:
> Hi Joel,
>
> OK, it's a quick hack and fatally flawed in at least one respect, but
> please take a look at it and let me know what you think.
>
> --Gregg
>
> REBOL [
>
> notes: {
> Joel Neely:
>
> OTOH, if you really want to be consistent with the philosophy
> of letting the human type for human consumption and requiring
> the formatting program to figure out what was meant, I'd love
> to see some code that can handle the following (exactly as it
> appears below, of course). Back in the day, I spent quite a
> bit of time trying to come up with a bit of AI that could take
> a flat file or printout image such as the following and infer
> where the columns were intended, whether each column was to be
> left-justified, right-justified, or centered, and what type of
> data should appear in each. It's non-trivial IMHO, but YMMV.
>
> Emp# First Name Last Name Nickname Pager Nr Phone Number
> ==== ========== =========== ======== ========
> 12 Johannes Doe Jake 888-1001 555-1212
> 3456 Ferdinando Quattlebaum Ferdy 800-555-1214
> 234 Betty Sue Doaks 555-1213
> 4567 Sue Ellen Van Der Lin 888-1002 888-555-1215
>
> Assume no leading space or trim all leading space
>
> Iterate over the first row
>     if you hit a space
>         drop down through that column
>             if you hit a non-space
>                 not a column delimiter
>             if you get to the bottom, and it's all spaces
>                 it's *probably* a column delimiter
>                 mark column
>
> You could do the same kind of thing for proportional fonts
> using pixel offsets in place of character offsets.
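For illustration, the column scan sketched in the quoted notes might look roughly like this in Python. This is only a sketch under assumptions: the function names `find_column_starts` and `split_row` are made up here, and the sample rows use a guessed fixed-width layout with fewer columns, since the archive copy of the table has lost its original spacing.

```python
def find_column_starts(lines):
    """Guess column start offsets: a character position counts as a
    delimiter only if every row has a space (or nothing) there."""
    rows = [line.rstrip() for line in lines if line.strip()]
    width = max(len(row) for row in rows)
    padded = [row.ljust(width) for row in rows]
    # blank[i] is True when position i is a space in every row
    blank = [all(row[i] == " " for row in padded) for i in range(width)]
    # a column starts where a non-blank position follows a blank one
    return [i for i in range(width)
            if not blank[i] and (i == 0 or blank[i - 1])]


def split_row(row, starts):
    """Cut one row at the detected offsets and trim each cell."""
    bounds = starts + [None]
    return [row[a:b].strip() for a, b in zip(bounds, bounds[1:])]


# Hypothetical aligned sample (spacing assumed, not taken from the post)
SAMPLE = [
    "Emp# First Name Last Name   Pager Nr",
    "  12 Johannes   Doe         888-1001",
    "3456 Ferdinando Quattlebaum",
    " 234 Betty Sue  Doaks",
    "4567 Sue Ellen  Van Der Lin 888-1002",
]

starts = find_column_starts(SAMPLE)
for row in SAMPLE:
    print(split_row(row, starts))
```

The scan only works while no space inside a value happens to line up in every row, which is why the result is "probably" a delimiter rather than certainly one.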

I'm no Rebol programmer and I don't know an 'AI' solution, but when I work with tables like this I read the table header line(s) first and then parse by position according to the column headings - which is pretty close to how people do it. Notice in your example above I've inserted a few additional "=" so as to make the headings unambiguous. Then they can be used to parse the lines of the table.

Brent Meeker

There are two ways to write error-free programs. Only the third one works.
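
A rough Python sketch of the header-driven approach described above, for anyone who wants to try it. It assumes the second line of the table is a ruler of "=" runs, one run per column, and that each run spans the full width of its column; the names `spans_from_ruler` and `parse_table` and the sample spacing are illustrative assumptions, not something from the original post.

```python
import re


def spans_from_ruler(ruler):
    """Return (start, end) offsets for each run of '=' in the ruler line."""
    return [m.span() for m in re.finditer(r"=+", ruler)]


def parse_table(lines):
    """Parse rows by position, using the ruler line to locate the columns."""
    header, ruler, *rows = lines
    spans = spans_from_ruler(ruler)
    # column names come from the header text sitting above each ruler run
    names = [header[a:b].strip() for a, b in spans]
    return [
        {name: row[a:b].strip() for name, (a, b) in zip(names, spans)}
        for row in rows
    ]


# Hypothetical sample with one '=' run per column (spacing assumed)
TABLE = [
    "Emp# First Name Last Name   Pager Nr",
    "==== ========== =========== ========",
    "  12 Johannes   Doe         888-1001",
    " 234 Betty Sue  Doaks",
]

for record in parse_table(TABLE):
    print(record)
```

Slicing past the end of a short row simply yields an empty string in Python, so missing trailing cells come back as empty values rather than errors.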