Mailing List Archive: 49091 messages

[REBOL] Re: Parse versus Regular Expressions

From: joel:neely:fedex at: 4-Apr-2003 17:51

Hi, Ladislav,

As usual, you've provided much food for thought.

Ladislav Mecir wrote:

> Yes, PARSE dialect looks procedural. Nevertheless, all the
> "language stuff" is a hybrid between procedural and declarative
> descriptions: e.g. Regular Expressions and FSM's describe the
> same - Regular Languages.
>
> Similarly Grammars (more declarative) and Turing Machines
> (procedural) describe the same languages.
>
...
> If you really want a symmetrical OR parse rule, it can be programmed:
>
I'm quite willing to be educated, so here's another example. I'll state the problem, then wait for suggestions on how to use PARSE before I show the solution (using Python's RE engine).

Lest I be accused of theory, I'll point out that this is a disguised and simplified version of a program I've recently written for Real Work.

GIVEN:

A file of lines, each of which is 80 characters, and contains:

  1) a six-digit leading sequence number,
  2) a 66-character body area,
  3) an eight-digit trailing sequence number.

The body area contains sentences, which end in a period followed by whitespace. A sentence may spread across the body areas of one or more lines, but if a sentence ends on one line, the rest of that body will be blank and the next sentence will begin in a subsequent line.

If the body area begins with an asterisk, it is to be ignored.

Consecutive lines should have both leading and trailing sequence numbers that are in order.

OUTPUT:

If any line has an out-of-order leading or trailing sequence number, echo that line to output as an error.

Output whole sentences (with redundant whitespace removed) as individual lines of output.

To illustrate (although my lines here aren't 80 bytes, to avoid email line wrap and save typing):

    00001 This is a sentence.                       20030301
    00002 So                                        20030302
    00003 is                                        20030302
    00004 this.                                     20030302
    00005* this will disappear                      20030303
    00006* but this won't because of sequence order 20030101
    00007 The last                                  20030304
    00008 sentence.                                 20030304

should get output like this

    This is a sentence.
    So is this.
    ERROR: 000006* but this won't because of sequenc...
    The last sentence.

Thanks in advance to anyone who offers PARSE solutions!

-jn-

--
----------------------------------------------------------------------
Joel Neely joelDOTneelyATfedexDOTcom 901-263-4446
Counting lines of code is to software development as counting bricks is to urban development.
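
For readers who want to experiment with the problem as stated, here is a minimal Python sketch of one possible reading of the spec. It is not the solution promised in the message, which does not appear here. The names `LINE_RE`, `make_line` and `process` are hypothetical. The sketch also makes several assumptions the message leaves open: "in order" is taken to mean non-decreasing relative to the immediately preceding line (the sample repeats trailing number 20030302), an out-of-order line is echoed in full and otherwise skipped, and a line that does not fit the 6 + 66 + 8 layout is also reported as an error.

```python
import re

# Assumed fixed-width layout from the problem statement:
# 6-digit leading sequence, 66-character body, 8-digit trailing sequence.
LINE_RE = re.compile(r"^(\d{6})(.{66})(\d{8})$")


def make_line(seq, body, trail):
    """Hypothetical helper: build one 80-character test line."""
    return f"{seq:06d}{body:<66}{trail:08d}"


def process(lines):
    """Return the output lines: whole sentences plus ERROR echoes."""
    out = []
    words = []   # words of the sentence currently being assembled
    prev = None  # (leading, trailing) numbers of the previous well-formed line
    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE_RE.match(line)
        if m is None:
            # Assumption: a malformed line is reported like a sequence error.
            out.append("ERROR: " + line)
            continue
        lead, body, trail = int(m.group(1)), m.group(2), int(m.group(3))
        # Assumption: "in order" means non-decreasing versus the previous line.
        in_order = prev is None or (lead >= prev[0] and trail >= prev[1])
        prev = (lead, trail)
        if not in_order:
            # Assumption: echo the whole line and do not use its body.
            out.append("ERROR: " + line)
            continue
        if body.startswith("*"):
            continue  # body areas starting with an asterisk are ignored
        words.extend(body.split())  # split() drops redundant whitespace
        if body.rstrip().endswith("."):
            # A sentence ends on this line; the rest of the body is blank.
            out.append(" ".join(words))
            words = []
    return out


if __name__ == "__main__":
    sample = [
        make_line(1, " This is a sentence.", 20030301),
        make_line(2, " So", 20030302),
        make_line(3, " is", 20030302),
        make_line(4, " this.", 20030302),
        make_line(5, "* this will disappear", 20030303),
        make_line(6, "* but this won't because of sequence order", 20030101),
        make_line(7, " The last", 20030304),
        make_line(8, " sentence.", 20030304),
    ]
    for result in process(sample):
        print(result)
```

Unlike the illustration in the message, this sketch echoes the full 80-character line after `ERROR:` rather than a truncated one, and any text left over from an unfinished sentence at end of input is silently dropped.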