Matching RE's

[1/7] from: d4marcus::dtek::chalmers::se at: 22-Mar-2001 23:53

Warning! This mail contains a lot of whining. Don't read it if you easily get bored. You have been warned, so bear with me. Ok, so I start out thinking this will work:

>> "" = find/case/any/match "user.r" "*.r"

== false

Hmm, so the dot is a wildcard (equal to ?) and needs to be escaped:

>> "" = find/case/any/match "user.r" "*^.r"

== false

So it didn't work, let's try another RE construct:

>> "" = find/case/any/match "user.r" "*[.]r"

== true

Aha, that looks better, but let's check it first:

>> "" = find/case/any/match "user" "*[.]r"

== true

Oh no! The dot is still acting as a wildcard. What to do now? Perhaps we can escape it within the brackets:

>> "" = find/case/any/match "user" "*[^.]r"

== true

Nope, didn't work. Let's try something else:

>> "" = find/case/any/match "user" "*[\.]r"

== false

>> "" = find/case/any/match "user.r" "*[\.]r"

== true

Aha, this could be it! But wait, perhaps... :

>> "" = find/case/any/match "user\r" "*[\.]r"

== true

Ouch, it seems to be a choice between backslash and dot. Or is it?

>> "" = find/case/any/match "user r" "*[\.]r"

== true

AARGH! Does it even work at all?

>> "" = find/case/any/match "user qwerty r" "*[ gargle ]r"

== true

Double-AARGH! Oh well, whatever those brackets do, I better stop using them. Let's see... :

>> "" = find/case/any/match "user qwerty r" "*gargler"

== false

Phew, I was starting to worry. Let's check again:

>> "" = find/case/any/match "user qwerty r" "*garglr"

== true

What???!!! Oh well, before we give up, better check the PDF. Hmm, seems they have put /match before /any there. Wonder why... :

>> "" = find/case/match/any "user qwerty r" "*garglr"

== true

Nope. Oh well, one last try:

>> "" = find/match/any "user qwerty r" "*garglr"

== false

Yes! This is it, it was a problem with /case. So we get:

>> "" = find/match/any "user.r" "*.r"

== true

>> "" = find/match/any "user" "*.r"

== false

Good good. On the other hand, we also get:

>> "" = find/match/any "user.r" "*.R"

== true

Not good. What if we put /case back in there:

>> "" = find/match/any/case "user.r" "*.R"

== false

>> "" = find/match/any/case "user.r" "*.r"

== false

And so we're back where we started. Sigh, I guess I better write an RE to Rebol parse rule translator. Unless such a thing exists already...?

Some other examples:

>> find/case/any "user.R" '*r.R

== none

>> find/case/any "user.R" '*r.*

== none

>> find/case/any "user.R" '*r.r

== "user.R"

>> find/case/any "user.R" '*r.?

== "user.R"

Only the last one is correct. That pretty much says it all.

Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining

[2/7] from: arolls:bigpond:au at: 23-Mar-2001 18:15

Re: Matching RE's - pattern matching

Marcus,

This is what I use in my 'ls function found here: http://users.bigpond.net.au/datababies/anton/rebol/dir-utils.r

This matches filenames in a directory list:

    foreach file list [
        match: find/any/match file :pattern
        if match [
            if (index? tail file) = (index? match) [
                ; this is where you can say it actually matches
                if error? try [
                    append either dir? file [dir-list] [file-list] file
                ] []
            ]
        ]
    ]

'pattern is a string such as "r*.r"

In my quick testing the /case refinement works with this too - eg.

>> file: to-string %Rtest.blah

== "Rtest.blah"

>> match: find/case/any/match file "R*.*"

== ""

>> (index? tail file) = (index? match)

== true

Regards, Anton.
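The end-of-string check above can be wrapped in a small helper. This is only a minimal sketch, assuming REBOL 2 and the find/any/match behaviour shown in the transcript; the name full-match? is hypothetical and does not come from dir-utils.r. It deliberately leaves out /case, since that refinement is what the rest of the thread has trouble with.

```rebol
REBOL [Title: "full-match? sketch"]

; Hypothetical helper, not taken from dir-utils.r.
; FIND/ANY/MATCH returns the position just past the matched text, or none.
; The pattern is taken to cover the whole name only when that position
; is the tail of the string.
full-match?: func [file [string!] pattern [string!] /local match] [
    all [
        match: find/any/match file pattern   ; none stops ALL here
        tail? match                          ; same test as comparing index? values
    ]
]

print full-match? "Rtest.blah" "R*.*"
```

The helper returns none rather than false when there is no match, which is still fine as a condition for if or either.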

[3/7] from: jelinem1:nationwide at: 23-Mar-2001 8:43

Re: Matching RE's

>> find/case/any/match "user.r" "*.r"

== ".r"

>> find/case/any/match "user.r" "*^.r"

== ".r"

>> find/any/match "user.r" "*.r"

== ""

>> index? find/any/match "user.r" "*.r"

== 7

/case seems to be presenting a problem. You might try viewing just the return from 'find, and getting this 'index, to see what is happening in a bit more detail. Hopefully this will help you understand what is going on.

>> find/case/any/match "user qwerty r" "*gargler"

== none

>> find/case/any/match "user qwerty r" "*garglr"

== ""

>> find/case/any/match "user qwerty r" "*ggly"

== " r"

That totally boggles my mind. After staring at it for awhile I can almost see what it's doing, but it's certainly not the behavior I initially expected.

- Michael Jelinek

[4/7] from: d4marcus:dtek:chalmers:se at: 23-Mar-2001 19:41

Re: Matching RE's - pattern matching

On Fri, 23 Mar 2001, Anton wrote:

> This is what I use in my 'ls function found here:
> http://users.bigpond.net.au/datababies/anton/rebol/dir-utils.r

I downloaded that one a while ago together with shell.r (which someone else posted). Seems I forgot to check your script for an 'ls function. But still, what I get with your script is this (as I'm using Linux you might understand why it is a problem):

>> ls *.R

user.r

But still, I guess I should have a more thorough look, because in one of my own recent scripts I tried to add some of that functionality. Have a look if you want:

    rebol []

    ; for example: recurse %. '*/* creates a block structure which can be
    ; parsed by another script of mine
    recurse: func [dir [file!] pattern [any-type!]] [
        pattern: either value? 'pattern [
            if block? :pattern [pattern: form pattern]
            if string? :pattern [replace/all :pattern "/" " "]
            to-block :pattern
        ] [[*]]
        recurse! (clean-path dirize dir) pattern
    ]

    ; compatible to Unix ls -d
    recurse!: function [dirpath [file!] pattern [block!]] [
        result block dirnext
    ] [
        result: copy []
        foreach file sort/case read dirpath [
            if match-file file first pattern [
                either empty? next pattern [append result file] [
                    dirnext: to-file reduce [dirpath file]
                    if dir? dirnext [
                        block: recurse! dirnext next pattern
                        if not empty? block [
                            append result file
                            append/only result block
                        ]
                    ]
                ]
            ]
        ]
        result
    ]

    match-file: function [file [file! string!] pattern [any-word! string!]] [match] [
        all [
            match: find/case/any/match to-string file pattern
            parse match ["/" | end]
        ]
    ]

Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining

[5/7] from: arolls:bigpond:au at: 24-Mar-2001 14:29

Marcus, you wrote:

> > This is what I use in my 'ls function found here:
> > http://users.bigpond.net.au/datababies/anton/rebol/dir-utils.r

<<quoted lines omitted: 4>>

> >> ls *.R
> user.r

Ok, I didn't think about that. I have added a /case refinement to 'ls which shows the same problem as you are having. (Download from above.) Your match-file function seems to do the same thing as my match code.

I want to check out your recurse as well. It's neater than my tree. I implemented a stack for mine. :)

I think it could be time to write our own matching parse rules. I was thinking this already but never got around to it.

Anton.

[6/7] from: d4marcus:dtek:chalmers:se at: 24-Mar-2001 18:53

Re: Matching RE's

On Fri, 23 Mar 2001 [jelinem1--nationwide--com] wrote:

> >> find/case/any/match "user qwerty r" "*ggly"
> == " r"
> That totally boggles my mind. After staring at it for awhile I can almost
> see what it's doing, but it's certainly not the behavior I initially

Yes, I think I can too. But it's almost the opposite of what it really should do. Compare these:

>> find/case/any/match "user" "*e"

== "r"

>> find/case/any/match "user" "*se"

== none

>> find/case/any/match "user" "*ae"

== "r"

If there's more than one character to match, it returns none only if it matches, otherwise the next one. Opposite! Trying again:

>> find/case/any/match "user" "*user"

== none

>> find/case/any/match "user" "*usar"

== none

>> find/case/any/match "user" "*uaar"

== none

>> find/case/any/match "user" "*aaar"

== ""

No character except the last one must match, otherwise it returns none. Seems they have inverted the logic somehow. I probably should report this as a bug.

Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining

[7/7] from: d4marcus::dtek::chalmers::se at: 24-Mar-2001 19:07

Re: Re(2): Matching RE's - pattern matching

On Sat, 24 Mar 2001, Anton wrote:

> I want to check out your recurse as well.
> It's neater than my tree. I implemented
> a stack for mine. :)

I have made a tree function as well, which of course uses a stack too. Btw, this tree function is actually made of two functions, one called tree-builder and one called tree-parser. The tree function then looks like this:

    set 'Tree func [
        "Print directory tree"
        dir [file! unset!] "Directory to list"
    ] [
        any [value? 'dir dir: %.]
        tree-init tree-parser tree-builder dir ()
    ]

tree-init initialises the rules to use when parsing the tree. This way you can easily make several functions with various effects, like cloning a directory structure, running all scripts matching .r etc... Eventually tree-builder will be able to take patterns too, like recurse does, but without paths like */* because paths make no sense for tree.

> I think it could be time to write our own
> matching parse rules. I was thinking this
> already but never got around to it.

It would be good enough just being able to translate "?*" to some 'parse rule.

Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
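One way such a translation from "?" and "*" to a 'parse rule might look is sketched below. It is a rough sketch only, assuming REBOL 2 parse semantics; the names wild-rule and wild-match? are hypothetical and do not appear in any script mentioned in this thread. The rule is built from the end of the pattern backwards: a literal character becomes that character followed by the rest of the rule, ? becomes skip, and * becomes a block that refers to itself, so that it either matches the rest of the pattern here or skips one character and tries again.

```rebol
REBOL [Title: "Wildcard to parse rule sketch"]

; Hypothetical helper: build a PARSE rule from a wildcard pattern.
wild-rule: func [pattern [string!] /local rule star bar] [
    bar: first [|]                      ; the alternation word, as a value
    rule: copy [end]                    ; innermost rule: input must be used up
    foreach ch head reverse copy pattern [
        rule: switch/default ch [
            #"?" [reduce ['skip rule]]  ; exactly one character, then the rest
            #"*" [
                ; star: [rest | skip star], a self-referencing block
                star: copy []
                append star reduce [rule bar 'skip star]
                star
            ]
        ] [reduce [ch rule]]            ; literal character, then the rest
    ]
    rule
]

; Hypothetical wrapper: /all keeps whitespace significant,
; /case makes literal characters case-sensitive.
wild-match?: func [string [string!] pattern [string!]] [
    parse/all/case string wild-rule pattern
]

print wild-match? "user.r" "*.r"
print wild-match? "user" "*.r"
print wild-match? "user.r" "*.R"
```

If the sketch behaves as intended, the first call succeeds and the other two fail, which is the case-sensitive behaviour asked for at the start of the thread. Dropping /case from the parse call would give the case-insensitive variant.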

Notes

Quoted lines have been omitted from some messages.