
Parsing for fun and profit

 [1/7] from: learned:talentsinc at: 24-Sep-2001 13:43


I've been working with the parsing routines for a little utility project I've inherited, and while I'm coming up with some brute force solutions, I'd love to hear any suggestions for streamlining the code. The data I am inputting has assorted formats embedded within the text files I have to read. I've included an example below:

    ---
    SYS SER #: 123    VER: 4    REV: 7a
    Processor SER: 3.2

            Slot 1  Slot 2  Slot 3  Slot 4
    Board   123     123     123     123
    Rev     2.3     2.3     2.3     3.4

    Drive Information:

    Slot  Manuf     Type              Rev   Serial #
    ----  --------  ----------------  ----  ------------
    A0    MYXPTL    XXX786768XX       1.2   8687163816SA
    A1    MYXPTL    XXX786768XX       1.2   8687163817SA

What I have been doing is double parsing to get the information. For example, to get the REV off the first line, the parse is:

    thru "REV:" copy vREV to "Processor"
    vREV: parse vREV none

The purpose of the second parse is to get rid of whitespace. In the case of the table information, for example Board, I am doing the following:

    thru "Board" copy vBoard to "Rev"
    vBoard: parse vBoard none

The second parse in this case serves to return each 'slot' as a separately addressable data block (i.e. vBoard/1). I haven't started parsing the last section yet, which would have 5 pieces of information for each drive found.

Are there easier ways to get to where I'm going, or am I on the right track in trying to "think REBOL"?

Thanks
Gary

---
G. Edw. Learned - [learned--talentsinc--net]
(Never apply a Star Trek Solution to a Babylon Five Problem)

 [2/7] from: lmecir:mbox:vol:cz at: 24-Sep-2001 23:17


Hi Gary,

I think that "double parsing" may be convenient in some cases. If you know for certain that whitespace characters serve only as delimiters throughout the whole text, then you can swap the order of operations and write the parse rules more efficiently, like this:

    preparsed: parse data none
    parse preparsed [
        thru "REV:" copy vRev to "Processor"
        thru "Board" copy vBoard to "Rev"
    ]

etc.

If whitespace could have a different meaning somewhere, I would suggest a different approach, i.e. using only a single parse.

Regards
Ladislav
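As a minimal sketch of the swapped order described above, the snippet below tokenizes first and then runs a block parse over the tokens. The sample string is a made-up stand-in for the first line of the file, and the trailing `to end` is added only so the rule consumes the remaining tokens; neither comes from the original posts. One thing worth noticing is that in block mode `copy` captures a block of tokens rather than a string, so the value itself is the first element of that block.

```rebol
REBOL [Title: "Double parsing sketch (illustrative)"]

; Assumption: a shortened, hypothetical stand-in for the real input file.
data: "SYS SER #: 123   VER: 4   REV:   7a   Processor SER: 3.2"

; Pass 1: with NONE as the rule, PARSE splits the string on whitespace
; and returns a block of strings.
preparsed: parse data none

; Pass 2: block parsing over the tokens instead of over characters.
parse preparsed [
    thru "REV:" copy vRev to "Processor"
    to end   ; added for this sketch so the rule reaches the end of the block
]

; In block mode COPY captures a block of the matched tokens,
; so the revision string is its first element.
probe vRev
print first vRev
```

Because the tokens no longer carry surrounding whitespace, the second clean-up parse from the original approach should not be needed for single values.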

 [3/7] from: learned::talentsinc::net at: 24-Sep-2001 16:31


OK, I'll buy that. How about the final section of the file, where I have a table of 1 to n rows (where n < 100) with 5 values each?

    123 234 345 46 56469u0
    111 234 234 25 8798jh9

In this case, I'd love to end up with a two-dimensional array of [n 5], but I haven't figured out how to swing that one yet. Seems like it should be easy, but I have my brain wrapped around my axle.

--On Monday, September 24, 2001 11:17 PM +0200 Ladislav Mecir <[lmecir--mbox--vol--cz]> wrote:
> Hi Gary,
> I think, that the "double parsing" may be convenient in some cases. If you
<<quoted lines omitted: 76>>
> [rebol-request--rebol--com] with "unsubscribe" in the
> subject, without the quotes.

---
G. Edw. Learned - [learned--talentsinc--net]
(Never apply a Star Trek Solution to a Babylon Five Problem)

 [4/7] from: lmecir:mbox:vol:cz at: 25-Sep-2001 0:37


OK,

how about this:

    preparsed: parse data none
    parse preparsed [
        thru "REV:" copy vRev to "Processor"
        thru "Board" copy vBoard to "Rev"
        thru "#" (vSerial: copy [])
        5 string!
        any [
            copy vs 5 string! (append/only vSerial vs)
        ]
    ]

Cheers
Ladislav
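The rule above relies on `append/only` to keep each group of five tokens together as one row. A small sketch of the difference from plain `append` follows; the words `flat`, `nested` and `row` are illustrative names only and do not appear in the thread.

```rebol
REBOL [Title: "append versus append/only (illustrative)"]

; Hypothetical row of five tokens, shaped like one drive-table line.
row: ["123" "234" "345" "46" "56469u0"]

flat: copy []
nested: copy []

append flat row          ; splices the five strings into FLAT
append/only nested row   ; adds ROW itself as a single nested element

print length? flat       ; 5
print length? nested     ; 1
```

With plain `append` the row boundaries would be lost, which is why the `/only` refinement is what produces the two-dimensional shape asked for.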

 [5/7] from: learned:talentsinc at: 24-Sep-2001 21:12


You totally lost me on your last bit of code... I'm digging through it now. To simplify, let's take a block that only has the last section and say we want to get it into a 2-dimensional array. So, for example:

    blob:
    123 abd 123456788 jk234234f 12345
    234 kjl 124kjlkj werwerr35 234jk
    253 klj klj343jk klwj32543 jkljl
    . . .
    853 lk3 2342342 23424lk kjlkjlkj

So, we have a blob of n rows, each with 5 elements inside of it. The goal is to put it into an array such as:

    blobarray: array [x 5]

Sorry I'm being dense.

Gary

--On Tuesday, September 25, 2001 12:37 AM +0200 Ladislav Mecir <[lmecir--mbox--vol--cz]> wrote:
> OK,
> how about this:
<<quoted lines omitted: 118>>
> [rebol-request--rebol--com] with "unsubscribe" in the
> subject, without the quotes.

---
G. Edw. Learned ( [learned--talentsinc--net] )
(Never apply a Star Trek Solution to a Babylon 5 Problem)

 [6/7] from: lmecir:mbox:vol:cz at: 25-Sep-2001 8:18


Hi,

it was a "double parsing" solution. A "double parsing" solution of your simplified task may be:

    preparsed: parse blob none
    parse preparsed [
        (blobarray: copy [])
        any [
            copy row 5 string! (append/only blobarray row)
        ]
    ]

If you prefer a "single parsing" solution, then it could be like:

    nonspace: complement charset " "
    parse/all blob [
        (blobarray: copy [])
        any [
            (row: copy [])
            5 [
                any " " copy string any nonspace
                (append row string)
            ]
            (append/only blobarray row)
            thru newline
        ]
    ]

Cheers
Ladislav

 [7/7] from: lmecir:mbox:vol:cz at: 25-Sep-2001 8:59


Hi,

I wrote:
> If you prefer a "single parsing" solution, then it could be like:
> nonspace: complement charset " "
<<quoted lines omitted: 8>>
> ]
> ]

which was untested. A corrected version:

    nonspace: complement charset " ^/^(tab)"
    space: charset " ^(tab)"
    parse/all blob [
        (blobarray: copy [])
        any [
            (row: copy [])
            5 [
                any space copy string some nonspace
                (append row string)
            ]
            (append/only blobarray row)
            thru newline
        ]
    ]
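
For readers wondering how the result is then used as a two-dimensional array, here is a short sketch. It assumes `blobarray` ends up as a block of five-string rows, which is the shape the rules in this thread are meant to build; the literal block below is hand-written from the sample data for illustration and is not output captured from those rules.

```rebol
REBOL [Title: "Addressing a block of rows (illustrative)"]

; Assumption: the shape the parse rules are expected to produce,
; written out by hand from the first two sample rows.
blobarray: [
    ["123" "abd" "123456788" "jk234234f" "12345"]
    ["234" "kjl" "124kjlkj" "werwerr35" "234jk"]
]

print length? blobarray        ; number of rows
print blobarray/2/3            ; third field of the second row

; Walk the rows and show the first and last field of each.
foreach row blobarray [
    print [row/1 row/5]
]
```

Path notation such as `blobarray/2/3` gives the row-and-column style access that `array [x 5]` was meant to provide, without having to know the number of rows in advance.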
