[REBOL] Re: R: Re: Help on parsing
From: tomc:darkwing:uoregon at: 15-Mar-2004 20:36
Hi Giuseppe Chillemi
what you are asking for is not too much.
I can't be sure where your lines were supposed to end, but
I assume it is after the <br> that follows the phone number.
One way to be sure that you do not skip over too much while looking for the fax
number is to break the input into individual lines when you read it
(if it is stored with one record per row), something like:
foreach line read/lines %file [parse line rule]
But you can also parse the whole thing by building a parse rule that looks
at each line in turn. The rule below could be made more flexible by
writing a rule to handle whitespace, e.g. ws: [any [" " | tab]],
and putting it wherever blanks may occur. Hopefully this will help.
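For what it is worth, here is a rough sketch of such a whitespace rule in use. It assumes that only spaces and tabs count as whitespace; the sample strings are made up for illustration and are not taken from your file.

```rebol
;; zero or more blanks, spaces and tabs mixed in any order
ws: [any [" " | tab]]

;; the same rule should accept several blanks, a tab, or no blank at all
foreach sample ["Tel.:   1234" "Tel.:^-1234" "Tel.:1234"] [
    print [mold sample "->" parse/all sample ["Tel.:" ws "1234"]]
]
```

Since ws can match nothing at all, it is harmless to put it anywhere a blank might or might not appear.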
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; a run of "N" means: three or more digits,
;;; with no upper limit on how many
digit: charset {0123456789}
phone: [3 digit any digit " " 3 digit some digit]
;;; could also say
;;; phone: [3 4 digit " " 7 9 digit]
;;; or what ever number, if you knew the ranges
;;; an object to store a record ... could also use a simple block
mark: make object! [
    name: copy ""
    address: copy ""
    phone: copy ""
    fax: copy ""
]
;;; a block to store the objects in
marks: copy []
;;; parse rule for a line -- I just use the word 'token' out of habit
line: [
    (m: make mark [])
    any newline ; there may not be one at the start/end
    "name-keyword " copy token to " " (m/name: token) " "
    thru "address-keyword" copy token to <br> (m/address: token) <br>
    thru "Tel.: " copy token phone (m/phone: token)
    opt [" - Fax.: " copy token phone (m/fax: token)]
    any " "
    <br>
    (append marks m)
]
;;; mind the wrap
page:
{name-Keyword NAME-VALUE unusefulltext address-keyword ADDRESS-VALUE <BR>
unusefulltext Tel.: 1234 12345678 <BR>
name-Keyword NAME-VALUE unusefulltext address-keyword ADDRESS-VALUE <br>
unusefulltext Tel.: 1234 567891011 - Fax.: 1234 110198765 <BR>
}
parse/all page [some line]
probe marks
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
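If you do know how long the numbers can be, the ranged form of the phone rule mentioned in the comments can be tried on its own. This is only a sketch: the ranges (3 to 4 digits, a space, 7 to 9 digits) are the example figures from the comment, and the name phone-ranged and the test strings are invented here.

```rebol
digit: charset "0123456789"

;; assumed ranges: 3-4 digit prefix, one space, 7-9 digit number
phone-ranged: [3 4 digit " " 7 9 digit]

;; the first string fits the ranges; the second has too short a prefix,
;; the third too short a number
foreach candidate ["1234 12345678" "12 12345678" "1234 123456"] [
    print [candidate "->" parse/all candidate phone-ranged]
]
```

A ranged rule rejects numbers outside the limits instead of accepting any long run of digits, which helps if other numbers can appear in the "unusefulltext" parts.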
On Sun, 14 Mar 2004, Giuseppe Chillemi wrote: