[REBOL] Re: 5 simple pattern matching questions

From: g:santilli:tiscalinet:it at: 16-Sep-2000 18:11

Hello [princepawn--MailAndNews--com]!

On 15-Sep-00, you wrote:

 p> I am having problems switching my understanding of regular
 p> expressions to the REBOL parse dialect.

REBOL's parse dialect is mainly designed to parse "grammars" rather than to do pattern matching. So there are a lot of things that are very simple to do with regexps and quite difficult with PARSE, but there are also a lot of things that are incredibly simple to do with PARSE and almost impossible with regexps.

 p> 1. match "cat" at the beginning of a line

    lines: [
        "cat" (print "found cat") thru newline
      | thru newline
    ]
    parse/all string [some lines]

 p> 2. match "cat",
 p> immediately preceded and followed by a word boundary, e.g.,
 p> match "the cat in" or "the cat" but not "mercata"

The simplest way:

>> found? find parse "the cat in" ",;.:!()?" "cat"
== true
>> found? find parse "the cat" ",;.:!()?" "cat"
== true
>> found? find parse "mercata" ",;.:!()?" "cat"
== false

Using the parse dialect only:

    text: [some words]
    words: [
        "cat" separator (print "found cat")
      | some word-char separator
    ]
    separator: [some sep-char | end]
    word-char: complement sep-char: charset " ,;.:!()?"

>> parse/all "the cat in" text
found cat
== true
>> parse/all "the cat" text
found cat
== true
>> parse/all "mercata" text
== true

 p> 3. match "cat" on a line all by itself

Similar to 1.:

    lines: [
        "cat" newline (print "found cat")
      | thru newline
    ]
    parse/all string [some lines]

(Omit the /ALL refinement if you don't care about spaces.)

 p> 4. match the empty string: I think this is
 p> parse string ""

Actually it is PARSE/ALL STRING [END].

 p> 5. match any char: I think this is done by creating a bitset
 p> from a charset from hex 000 to hex 255 and parsing on that,
 p> but it doesnt work, e.g.,
 p> bset: charset [ #"^(00)" - #"^(FF)" ]
 p> parse " " [ some bset ]

SKIP will match any char, so [SOME SKIP] will go to the end (like [TO END]). Anyway, the above works too: it's just that without /ALL, PARSE ignores spaces and so treats " " as an empty string, and an empty string does not contain any characters.

HTH,
    Gabriele.
-- 
Gabriele Santilli <[giesse--writeme--com]> - Amigan - REBOL programmer
Amiga Group Italia sez. L'Aquila -- http://www.amyresource.it/AGI/
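
The whitespace behaviour described in the answer to question 5 can be checked with a short script. This is only a minimal sketch, assuming REBOL 2 (where PARSE has the /ALL refinement). The word ANY-CHAR is a name chosen here for illustration and does not appear in the original message; the bitset it holds is the one from question 5.

```rebol
REBOL [Title: "PARSE with and without /ALL (illustrative sketch)"]

; any-char is a hypothetical name; the bitset is the one from question 5.
any-char: charset [#"^(00)" - #"^(FF)"]

; Without /ALL, PARSE skips whitespace, so " " is treated as empty input
; and SOME (one or more matches) has nothing left to match.
print parse " " [some any-char]

; With /ALL, the space is an ordinary character and both rules consume it.
print parse/all " " [some any-char]
print parse/all " " [some skip]

; An empty string is matched by END alone.
print parse/all "" [end]
```

Going by the explanation in the message, the first call should print false and the other three should print true.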