Parsing for fun and profit
[1/7] from: learned:talentsinc at: 24-Sep-2001 13:43
I've been working with the parsing routines for a little utility project
I've inherited, and while I'm coming up with some brute force solutions,
I'd love to hear any suggestions for streamlining the code. The data I am
inputting has assorted formats embedded within the text files I have to
read. I've included an example below:
---
SYS SER #: 123 VER: 4 REV: 7a
Processor
SER: 3.2
          Slot 1  Slot 2  Slot 3  Slot 4
Board     123     123     123     123
Rev       2.3     2.3     2.3     3.4
Drive Information:
Slot Manuf    Type             Rev  Serial #
---- -------- ---------------- ---- ------------
A0   MYXPTL   XXX786768XX      1.2  8687163816SA
A1   MYXPTL   XXX786768XX      1.2  8687163817SA
What I have been doing is double parsing to get the information. For
example, to get the REV off the first line, the parse is:
thru "REV:" copy vREV to "Processor"
vREV: parse vREV none
the purpose of the second parse being to get rid of whitespace.
In the case of the table information, for example, Board, I am doing the
following:
thru "Board" copy vBoard to "Rev"
vBoard: parse vBoard none
The second parse in this case serves to return each 'slot' as a
separately addressable data block (i.e. vBoard/1).
I haven't started parsing the last section yet, which would have
5 pieces of information for each drive found.
Are there easier ways to get to where I'm going, or am I on the
right track in trying to "think REBOL"?
Thanks
Gary
---
G. Edw. Learned - [learned--talentsinc--net]
(Never apply a Star Trek Solution to a Babylon Five Problem)
[2/7] from: lmecir:mbox:vol:cz at: 24-Sep-2001 23:17
Hi Gary,
I think that "double parsing" may be convenient in some cases. If you
really know that the whitespace characters serve only as "delimiters"
throughout the whole text, then you can "swap" the order of the
operations and write the parse rules more efficiently, like this:
preparsed: parse data none
parse preparsed [
    thru "REV:" copy vRev to "Processor"
    thru "Board" copy vBoard to "Rev"
]
etc. If the whitespace could have a different "meaning" somewhere,
I would suggest a different approach, i.e. using only a single parse.
Regards
Ladislav
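(A minimal sketch of what the pre-parse step above produces, for anyone following along. It assumes REBOL 2, where parse with none as the rule splits a string on whitespace and returns a block of strings. The names line, tokens and slots are only illustrative; the row is copied from Gary's sample.)

```rebol
REBOL [Title: "Whitespace tokenizing sketch"]

; Assumption: REBOL 2 semantics for PARSE with NONE as the rule.
line: "Board 123 123 123 123"    ; one row from the sample data
tokens: parse line none          ; block of strings, label first
slots: next tokens               ; skip the "Board" label

print ["Slot 1 board:" slots/1]
print ["Number of slots:" length? slots]
```

Once the text is a block of strings, each slot is addressable by position, which is what makes the second set of rules in the message above so short.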
[3/7] from: learned::talentsinc::net at: 24-Sep-2001 16:31
OK, I'll buy that. How about the final section of the file, where I have
a table of 1 to n rows (where n < 100) of 5 values each:
123 234 345 46 56469u0
111 234 234 25 8798jh9
In this case, I'd love to end up with a two dimensional array
of [n 5], but I haven't figured out how to swing that one yet.
Seems like it should be easy, but I have my brain wrapped around
my axle.
[4/7] from: lmecir:mbox:vol:cz at: 25-Sep-2001 0:37
OK,
how about this:
preparsed: parse data none
parse preparsed [
    thru "REV:" copy vRev to "Processor"
    thru "Board" copy vBoard to "Rev"
    thru "#" (vSerial: copy []) 5 string!
    any [
        copy vs 5 string! (append/only vSerial vs)
    ]
]
Cheers
Ladislav
[5/7] from: learned:talentsinc at: 24-Sep-2001 21:12
You totally lost me on your last bit of code...I'm digging thru it now...to
simplify, let's take a block that only has the last section and say we want
to get it into a 2-dimensional array. So, for example:
blob:
123 abd 123456788 jk234234f 12345
234 kjl 124kjlkj werwerr35 234jk
253 klj klj343jk klwj32543 jkljl
.
.
.
853 lk3 2342342 23424lk kjlkjlkj
So, we have a blob of n rows, each with 5 elements inside of it. The goal
is to put it into an array such as:
blobarray: array [x 5]
Sorry I'm being dense.
Gary
[6/7] from: lmecir:mbox:vol:cz at: 25-Sep-2001 8:18
Hi,
it was a "double parsing" solution. A "double parsing" solution of your
simplified task may be:
preparsed: parse blob none
parse preparsed [
    (blobarray: copy [])
    any [
        copy row 5 string! (append/only blobarray row)
    ]
]
If you prefer a "single parsing" solution, then it could be like:
nonspace: complement charset " "
parse/all blob [
    (blobarray: copy [])
    any [
        (row: copy [])
        5 [
            any " " copy string any nonspace (append row string)
        ] (append/only blobarray row)
        thru newline
    ]
]
Cheers
Ladislav
[7/7] from: lmecir:mbox:vol:cz at: 25-Sep-2001 8:59
Hi,
I wrote:
> If you prefer a "single parsing" solution, then it could be like:
> nonspace: complement charset " "
<<quoted lines omitted: 8>>
> ]
> ]
, which was untested. A corrected version:
nonspace: complement charset " ^/^(tab)"
space: charset " ^(tab)"
parse/all blob [
    (blobarray: copy [])
    any [
        (row: copy [])
        5 [
            any space copy string some nonspace (append row string)
        ] (append/only blobarray row)
        thru newline
    ]
]
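(A rough sketch of how the resulting block of rows can be built and indexed. This is not the parse-rule method from the messages above: it swaps in a plain loop that takes the whitespace-split tokens five at a time, which is the same "double parsing" idea written out by hand. It assumes REBOL 2 for parse with none, and the three-row blob is made-up sample data in the shape Gary describes.)

```rebol
REBOL [Title: "Rows of five, sketch"]

; Hypothetical three-row sample, five values per row.
blob: {123 abd 123456788 jk234234f 12345
234 kjl 124kjlkj werwerr35 234jk
253 klj klj343jk klwj32543 jkljl}

tokens: parse blob none            ; assumption: REBOL 2 whitespace split
blobarray: copy []
while [(length? tokens) >= 5] [
    append/only blobarray copy/part tokens 5   ; one row of five strings
    tokens: skip tokens 5                      ; advance to the next row
]

print length? blobarray            ; number of complete rows collected
print blobarray/2/3                ; row 2, column 3
```

The block grows one row at a time, so there is no need to know the row count up front, and an element is reached with a path such as blobarray/2/3.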
Notes
- Quoted lines have been omitted from some messages.