
[REBOL] parse or Re:(6)

From: joel:neely:fedex at: 21-Sep-2000 7:03

Hi, Ryan...

[RChristiansen--pop--isdfa--sei-it--com] wrote:
[snip]
> You missed another option, which I had been using previously. Here
> is the function:
>
[snip]
> In other words, replace all instances of a set of characters with
> a new character that can be recognized later...
>

You're absolutely right! Thanks for catching my omission. I often
use that trick when html-izing text, with the cliche below:

    replace/all chunk-a-text {&} {@@@}
    replace/all chunk-a-text {<} {&lt;}
    replace/all chunk-a-text {>} {&gt;}
    replace/all chunk-a-text {"} {&quot;}
    replace/all chunk-a-text {@@@} {&amp;}

where the trick, of course, is to hide the ampersands before replacing
dangerous characters with entities that are themselves introduced by
ampersands.

To make up for my omission (and my lack of time/energy last night),
the following is a scheme for using parse to attack the paragraphing
problem you posted. I know it doesn't handle every possible case, but
I think it can be generalized in a fairly obvious way.

Enjoy!

-jn-

(Output first, as an appetizer! I'll leave the code un-indented to
facilitate cut-and-paste.)

=====================================================================
>> do %parsetest.r
-----
First paragraph ends here.
-----
-----
A sentence. End of second "paragraph."
-----
-----
I'm not sure. Is this the third paragraph?
-----
-----
The fourth paragraph contains some
embedded linebreaks along
the way. Will this work?
-----
-----
Another sentence. A "quotation." The end!
-----
-----
Well, maybe!
-----
>>
=====================================================================

REBOL []

; The archive lost the exact line breaks inside the fourth paragraph;
; the positions shown here are approximate.
paragraphs: {First paragraph ends here.
A sentence. End of second "paragraph."
I'm not sure. Is this the third paragraph?
The fourth paragraph contains some
embedded linebreaks along
the way. Will this work?
Another sentence. A "quotation." The end!
Well, maybe!}

parblock: copy []
currpar: copy ""
stopper: charset {.?!}
nonstop: complement stopper
fragment: copy ""

sent: [
copy fragment [any nonstop stopper]
(append currpar fragment)
[{^/} (append parblock currpar currpar: copy "")
|{"^/} (append parblock append currpar {"} currpar: copy "")
| none
]
]

parg: [
(parblock: copy [] currpar: copy "")
any sent
end (if 0 < length? currpar [append parblock currpar])
]

parse/all paragraphs parg

foreach currpar parblock [
print ["-----^/" currpar "^/-----"]
]
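
One way the scheme above might be packaged for reuse is sketched below. This is only a sketch under assumptions: the function name split-paragraphs and its local words are made up for illustration and are not from the original post. It targets REBOL 2 (it relies on parse/all), and it keeps the same rule that a paragraph ends at a stopper, optionally followed by a closing double quote, and then a newline. The one change in behaviour is that trailing text with no final stopper is kept as a last paragraph rather than dropped. Blank lines between paragraphs are still not handled.

```rebol
REBOL []

; split-paragraphs is a hypothetical helper name, not part of the original post.
split-paragraphs: func [
    "Split text into paragraphs ending in . ? or ! followed by a newline."
    text [string!]
    /local result current fragment stopper nonstop sentence
][
    result: copy []
    current: copy ""
    stopper: charset {.?!}
    nonstop: complement stopper
    ; One sentence: any run of non-stoppers plus the stopper that ends it.
    sentence: [
        copy fragment [any nonstop stopper]
        (append current fragment)
        [
            {^/} (append result current  current: copy "")
            | {"^/} (append result append current {"}  current: copy "")
            | none
        ]
    ]
    parse/all text [
        any sentence
        ; Keep whatever follows the last stopper (assumption: we want it).
        copy fragment to end
        (if fragment [append current fragment])
    ]
    if not empty? current [append result current]
    result
]

; Usage, printing each paragraph the same way as the script above.
foreach par split-paragraphs {One. Two?^/Three "four."^/Five} [
    print ["-----^/" par "^/-----"]
]
```

Keeping the working words local means the function can be called repeatedly without the global parblock and currpar being reset by hand.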