Mailing List Archive: 49091 messages

[REBOL] Not-too-smart Table Parser (was: how to handle tables?)

From: greggirwin::starband::net at: 23-Sep-2001 13:05

Hi Joel,

OK, it's a quick hack and fatally flawed in at least one respect, but
please take a look at it and let me know what you think.

--Gregg

REBOL [
    notes: {
        Joel Neely:

        OTOH, if you really want to be consistent with the philosophy
        of letting the human type for human consumption and requiring
        the formatting program to figure out what was meant, I'd love
        to see some code that can handle the following (exactly as it
        appears below, of course). Back in the day, I spent quite a
        bit of time trying to come up with a bit of AI that could take
        a flat file or printout image such as the following and infer
        where the columns were intended, whether each column was to be
        left-justified, right-justified, or centered, and what type of
        data should appear in each. It's non-trivial IMHO, but YMMV.

        Emp# First Name Last Name   Nickname Pager Nr Phone Number
        ==== ===== ==== ==== ====== ======== ===== == ===== ======
          12 Johannes   Doe         Jake     888-1001 555-1212
        3456 Ferdinando Quattlebaum Ferdy             800-555-1214
         234 Betty Sue  Doaks                         555-1213
        4567 Sue Ellen  Van Der Lin          888-1002 888-555-1215

        Assume no leading space or trim all leading space
        Iterate over the first row
            if you hit a space
                drop down through that column
                    if you hit a non-space
                        not a column delimiter
                    if you get to the bottom, and it's all spaces
                        it's *probably* a column delimiter
                        mark column

        You could do the same kind of thing for proportional fonts
        using pixel offsets in place of character offsets.
    }
]

; 5 16 28 37 43 46 52
data: [
    {Emp# First Name Last Name   Nickname Pager Nr Phone Number}
    {==== ===== ==== ==== ====== ======== ===== == ===== ======}
    {  12 Johannes   Doe         Jake     888-1001 555-1212}
    {3456 Ferdinando Quattlebaum Ferdy             800-555-1214}
    { 234 Betty Sue  Doaks                         555-1213}
    {4567 Sue Ellen  Van Der Lin          888-1002 888-555-1215}
]

; Copy LEN characters of S beginning at 1-based position START.
mid: func [s start len][return copy/part at s start len]

; Length of the longest item in ITEMS.
longest?: func [items /local item result] [
    result: 0
    foreach item items [
        result: max result length? item
    ]
    return result
]

; Return the character positions that hold a space or tab in every row.
find-columns: func [tbl-data /local i j ch ch2 found-col? result tmp-cols] [
    result: make block! 10
    tmp-cols: make block! length? tbl-data/1
    ; Candidates: every space or tab position in the first row.
    for i 1 length? tbl-data/1 1 [
        ch: pick tbl-data/1 i
        if any [(ch = #" ") (ch = #"^-")] [append tmp-cols i]
    ]
    ; Keep a candidate only if every remaining row is blank there too.
    foreach i tmp-cols [
        found-col?: true
        foreach row next tbl-data [
            ch: pick row i
            if all [(ch <> #" ") (ch <> #"^-")] [
                found-col?: false
                break
            ]
        ]
        if found-col? [
            append result i
        ]
    ]
    return result
]

; Trimmed contents of positions START through END of every row.
build-col: func [tbl-data start end /local row result] [
    result: make block! length? tbl-data
    foreach row tbl-data [
        append result trim mid row start (end - start + 1)
    ]
    return result
]

; One block per column.
split-to-cols: func [tbl-data /local col-offsets result] [
    col-offsets: find-columns tbl-data
    insert head col-offsets 0
    append col-offsets longest? tbl-data
    result: make block! (length? col-offsets) - 1
    for i 1 ((length? col-offsets) - 1) 1 [
        append/only result build-col tbl-data ((pick col-offsets i) + 1) pick col-offsets (i + 1)
    ]
    return result
]

; Split one row into trimmed fields at the given offsets.
build-row: func [row-data col-offsets /local start end result] [
    result: make block! length? col-offsets
    for i 1 ((length? col-offsets) - 1) 1 [
        start: (pick col-offsets i) + 1
        end: pick col-offsets (i + 1)
        append result trim mid row-data start (end - start + 1)
    ]
    return result
]

; One block per row.
split-to-rows: func [tbl-data /local col-offsets result] [
    col-offsets: find-columns tbl-data
    insert head col-offsets 0
    append col-offsets longest? tbl-data
    result: make block! (length? col-offsets) - 1
    foreach row tbl-data [
        append/only result build-row to-string row col-offsets
    ]
    return result
]

;print ["Column Offsets:" find-columns data]
;print ["Longest Item:" longest? data]
;print mold build-col data 1 5
;print mold build-col data 6 16
;print mold split-to-cols data
;print mold build-row to-string data/1 find-columns data
;print mold split-to-rows data

foreach row split-to-rows data [print mold row]

halt
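
A side note on rows of unequal length, offered as a sketch and not as a fix for whatever flaw the message alludes to. In REBOL 2, `pick` returns `none` when the index is past the end of a string, and `none <> #" "` is true, so `find-columns` treats a row that ends early as having a non-space at that position. Padding every row to the width of the longest one before scanning is one way around that. `pad-rows` below is a hypothetical helper, not part of the original script.

```rebol
; Hypothetical helper (not in the original script): returns copies of the
; rows, each padded with trailing spaces to the width of the longest row.
pad-rows: func [tbl-data /local width result] [
    width: 0
    foreach row tbl-data [width: max width length? row]
    result: make block! length? tbl-data
    foreach row tbl-data [
        row: copy row    ; leave the caller's strings untouched
        insert/dup tail row #" " (width - length? row)
        append result row
    ]
    return result
]

; Assumed usage with the script's own functions:
; print mold find-columns pad-rows data
```

Copying each row keeps the original `data` block unchanged, so the padded and unpadded results can be compared side by side.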