[REBOL] parse or Re:(6)
From: joel:neely:fedex at: 21-Sep-2000 7:03
Hi, Ryan...
[RChristiansen--pop--isdfa--sei-it--com] wrote:
[snip]
> You missed another option, which I had been using previously. Here
> is the function:
>
[snip]
> In other words, replace all instances of a set of characters with
> a new character that can be recognized later...
>
You're absolutely right! Thanks for catching my omission. I often
use that trick when html-izing text, with the cliche below:
replace/all chunk-a-text {&} {@@@}
replace/all chunk-a-text {<} {&lt;}
replace/all chunk-a-text {>} {&gt;}
replace/all chunk-a-text {"} {&quot;}
replace/all chunk-a-text {@@@} {&amp;}
where the trick, of course, is to hide the ampersands before replacing
the dangerous characters with entities, which themselves begin with
ampersands. The last line turns the hidden ampersands into entities too.
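For reuse, the same trick could be wrapped in a function. This is only a
minimal sketch: html-escape is a hypothetical name (not a REBOL built-in),
and it assumes the placeholder {@@@} never occurs in the input text.

```rebol
REBOL []

; html-escape is a hypothetical helper name, not part of REBOL itself.
; Assumption: the placeholder {@@@} never occurs in the input text.
html-escape: func [text [string!] /local result] [
    result: copy text                  ; leave the caller's string untouched
    replace/all result {&} {@@@}       ; hide the original ampersands first
    replace/all result {<} {&lt;}
    replace/all result {>} {&gt;}
    replace/all result {"} {&quot;}
    replace/all result {@@@} {&amp;}   ; restore the hidden ampersands as entities
    result
]

print html-escape {a < b & "c"}
; prints: a &lt; b &amp; &quot;c&quot;
```

Working on a copy keeps the original string intact, which matters because
replace/all modifies its argument in place.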
To make up for my omission (and my lack of time/energy last night),
the following is a scheme for using parse to attack the paragraphing
problem you posted. I know it doesn't handle every possible case, but
I think it can be generalized in a fairly obvious way.
Enjoy!
-jn-
(Output first, as an appetizer! I'll leave the code un-indented to
facilitate cut-and-paste.)
=====================================================================
>> do %parsetest.r
-----
First paragraph ends here.
-----
-----
A sentence. End of second "paragraph."
-----
-----
I'm not sure. Is this the third paragraph?
-----
-----
The fourth paragraph
contains some embedded
linebreaks
along the way. Will
this work?
-----
-----
Another sentence. A "quotation." The end!
-----
-----
Well, maybe!
-----
>>
=====================================================================
REBOL []
paragraphs: {First paragraph ends here.
A sentence. End of second "paragraph."
I'm not sure. Is this the third paragraph?
The fourth paragraph
contains some embedded
linebreaks
along the way. Will
this work?
Another sentence. A "quotation." The end!
Well, maybe!}
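parblock: copy []          ; finished paragraphs
currpar: copy ""           ; paragraph currently being accumulated
stopper: charset {.?!}     ; sentence-ending punctuation
nonstop: complement stopper
fragment: copy ""
; sent: one sentence, optionally followed by a paragraph break
; (a newline, or a closing quote and a newline, right after the stopper)
sent: [
copy fragment [any nonstop stopper]
(append currpar fragment)
[{^/} (append parblock currpar currpar: copy "")
|{"^/} (append parblock append currpar {"} currpar: copy "")
| none
]
]
; parg: reset the accumulators, consume every sentence, then keep
; whatever is left in currpar as the final paragraph
parg: [
(parblock: copy [] currpar: copy "")
any sent end
(if 0 < length? currpar [append parblock currpar])
]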
parse/all paragraphs parg
foreach currpar parblock [
print ["-----^/" currpar "^/-----"]
]
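
One possible direction for the generalization mentioned above, offered as
a sketch rather than as the author's own code: let any run of closing
characters follow the stopper, so a single alternative covers quotes,
parentheses and the like. The names closer and split-paragraphs are
hypothetical, and the choice of closing characters is an assumption. As
in the script above, trailing text with no stopper is not collected.

```rebol
REBOL []

stopper: charset {.?!}
nonstop: complement stopper
closer:  charset {")'}      ; assumed set of closers allowed after a stopper

; split-paragraphs is a hypothetical wrapper; it returns a block of strings.
split-paragraphs: func [text [string!] /local parblock currpar fragment sent] [
    parblock: copy []
    currpar: copy ""
    sent: [
        ; a sentence: text, one or more stoppers, then any closers
        copy fragment [any nonstop some stopper any closer]
        (append currpar fragment)
        ; a newline right after the sentence ends the paragraph
        [{^/} (append parblock currpar  currpar: copy "") | none]
    ]
    parse/all text [any sent end]
    if not empty? currpar [append parblock currpar]
    parblock
]

probe split-paragraphs {Done (really!)^/Next one.}
; prints: ["Done (really!)" "Next one."]
```

Keeping the accumulators local to the function avoids the global words
used in the script, so it can be called repeatedly without resetting them.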