Problems with parsing

[1/13] from: peter:carlsson:space:se at: 29-Nov-2001 14:28

Hello!

Any suggestions on how to make a parse rule for text NOT including a special pattern?

Crystal clear?!?

Best regards,
Peter Carlsson

----------------------------------------------------------------
Peter Carlsson              Tel: +46 31 735 45 26
Saab Ericsson Space AB      Fax: +46 31 735 40 00
S-405 15 Göteborg           Email: [peter--carlsson--space--se]
SWEDEN                      URL: http://www.space.se

[2/13] from: mario:cassani:icl at: 29-Nov-2001 16:20

Hallo Peter,

> Any suggestions on how to make a parse rule for text
> NOT including a special pattern?

'parse returns 'true or 'false depending on whether the rule is matched, so if you write a rule that checks whether the pattern exists, then

    not parse given-text pattern-rule

will be 'true if the pattern is not included.

> Crystal clear?!?

Hope this helps. If you share the piece of code that is driving you mad, helping you may be easier, as 'parse is a beast best tamed by playing in the REBOL console with some sample data and rules.

Mario
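Mario's suggestion can be sketched as follows. This is only an illustration: the sample strings and the "lazy" pattern are made up, and parse/all is used on the assumption that whitespace should be treated as ordinary characters.

```rebol
; PATTERN-RULE succeeds only if "lazy" occurs somewhere in the input.
; (The pattern and the sample texts are assumptions, not from the thread.)
pattern-rule: [to "lazy" to end]

given-text: "the quick brown fox"
print not parse/all given-text pattern-rule    ; true, the pattern is absent

given-text: "over the lazy dog"
print not parse/all given-text pattern-rule    ; false, the pattern is present
```

This tests the whole input at once. It does not give a sub-rule that can be embedded inside a larger rule.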

[3/13] from: brett:codeconscious at: 30-Nov-2001 11:41

Hi Peter,

> Any suggestions on how to make a parse rule for text
> NOT including a special pattern?
>
> Crystal clear?!?

Somewhat. It would be better to provide an example, because now I have to show the solution *and* design an example :)

I have come across this sort of problem a few times, and I thank Ladislav for showing me a solution. One example where you might do this is when you have a sub-rule that might consume something needed by an enclosing rule.

For my example, I'll parse a block rather than text, but the concept still applies. I want to parse the following block and print out every word, but if I encounter a "|" I'll print out the text ********** instead:

    my-block: [the quick brown fox | jumped | over the lazy]

This next bit of code will not work. If you try it you will see that no "*"s are printed; instead you will see the "|":

    single-word: [set item word! (print mold item)]
    phrase: [some single-word]
    parse my-block [
        phrase some ['| (print "**********") phrase]
    ]

The thing to note is that | is a word too. Therefore the "|" is "consumed" by the rule called SINGLE-WORD.

So one way to solve this is to give SINGLE-WORD some indigestion (make it fail) when it encounters a "|". To do this I will use a dynamic rule - a rule that is modified as parse is executing.

To force a rule to fail, make sure it cannot match anything any more. A way to ensure this is to skip past the end of the input. This next rule is guaranteed to fail every time:

    always-fails: [to end skip]

Using this, I now wrap SINGLE-WORD with a rule I call WORD-EXCEPT-BAR. The purpose of this new rule is to fail if it finds the "|" word; otherwise it goes ahead and runs SINGLE-WORD. I also need to modify PHRASE to call WORD-EXCEPT-BAR. The dynamic rule I mentioned earlier is called WEB.

Here are the rules, with the complex one split over multiple lines to improve readability:

    phrase: [some word-except-bar]
    word-except-bar: [
        [
            '| (web: :always-fails) |
            (web: :single-word)
        ]
        web
    ]

To finish off, I'll create a function to call parse with the correct rule and wrap the whole lot in an object just to be tidy:

    word-parsing-object: context [
        always-fails: [to end skip]
        single-word: [set item word! (print mold item)]
        word-except-bar: [
            ['| (web: :always-fails) | (web: :single-word)]
            web
        ]
        phrase: [some word-except-bar]
        set 'parse-words func [
            a-block [block!]
        ][
            ; parse the argument, not the global MY-BLOCK
            parse a-block [
                phrase some ['| (print "**********") phrase]
            ]
        ]
    ]

Here is a test run:

>> parse-words [the quick brown fox | jumped | over the lazy]
the
quick
brown
fox
**********
jumped
**********
over
the
lazy
== true

HTH
Brett.
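Since Peter asked about text, here is a minimal sketch of the same dynamic-rule trick applied to a string. It is an illustration only and not part of Brett's post: the names ANY-CHAR, CHAR-EXCEPT-DASHES and CXD and the "--" pattern are hypothetical, and REBOL 2 parse semantics are assumed (parse/all so that whitespace is not skipped).

```rebol
always-fails: [to end skip]      ; can never match, as in Brett's example
any-char: [skip]                 ; hypothetical: accepts any single character

; Hypothetical wrapper: fails where the input starts with "--",
; otherwise consumes exactly one character.
char-except-dashes: [
    ["--" (cxd: :always-fails) | (cxd: :any-char)]
    cxd
]

parse/all "abc--def" [
    copy before any char-except-dashes
    "--"
    copy after to end
]
print before    ; abc
print after     ; def
```

For a literal pattern like this one, `to "--"` alone would do the same job. The wrapper becomes useful when the excluded pattern is a more general rule.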

[4/13] from: rotenca:telvia:it at: 30-Nov-2001 2:52

Hi Brett,

why not:

    rule: [any ['| (print "*******") opt rule | set item word! (print mold item)]]
    parse my-block rule

This, by design, also returns true for a void block.

--- Ciao Romano

[5/13] from: brett:codeconscious at: 30-Nov-2001 13:52

Hi Romano,

Thanks for your post. It is a good demonstration of how a problem can often be thought about differently and therefore solved. That is something important to keep in mind when creating parse rules.

Something to think about regarding your solution is the ways in which it is not equivalent to mine. You already pointed out that yours returns true for a void block by design. But yours also returns true when '| is the first word in the block. Also by design? :)

It doesn't really matter whether it was or wasn't by design, but it might be interesting to work out how you would change your rule to ensure that '| is not the first word in the block.

However, my purpose wasn't to show how my example block could be parsed. Peter asked for a rule that matched text NOT including a special pattern. NOT is a useful operator in logic, and I wonder why it is not in Parse as a dialect keyword. That is, would it not be nice to have the following statement return true?

    parse [b] [not 'a]

Ladislav originally solved this problem when I asked about it before. He has some parse enhancements on his rebsite in the script called parseen.r. Worth a look. I could have saved some typing by responding to Peter that his question is answered in parseen.r by Ladislav - though you may need to look twice or thrice and learn something new to follow it - as is typical of Ladislav's work ;-)

Regards,
Brett.

[6/13] from: rotenca:telvia:it at: 30-Nov-2001 12:32

Hi, Brett

> Something to think about regarding your solution is the ways that it is not
> equivalent to mine. You already pointed out that yours returns true for a
> void block by design. But yours also returns true when '| is the first word
> in the block. Also by design? :)

No, of course :-)

> It doesn't really matter if was or wasn't by design, but it
> might be interesting to work out how you would change your rule to ensure
> that '| is not the first word in the block.

    rule1: [some ['| (print "*******") opt rule1 | set item word! (print mold item)]]
    rule: [h: opt ['| (h: tail h)] :h rule1]
    parse block rule

> However, my purpose wasn't to show how my example block could be parsed.
> Peter asked for a rule that matched text NOT including a special pattern.

I understand. My idea is that one should first match the special pattern and then take some action as a consequence, such as moving the input index to the end of the block and then asking for at least one more value (any-type!), which cannot succeed there.

> Ladislav originally solved this problem when I asked about it before. He has
> some parse enhancements on his rebsite in the script called parseen.r. Worth
> a look. I could have saved some typing by responding to Peter that his
> question is answered in parseen.r by Ladislav - though you may need to look
> twice or thrice and learn something new to follow it - as is typical of
> Ladislav's work ;-)

But there is at least one little problem with Ladislav's not-rule:

>> nr: not-rule [1]

== [[[1] (finish: [end skip]) | (finish: [])] finish]

>> parse [1] nr

== false

>> parse [2] nr

== false

> Regards,
> Brett.

--- Ciao Romano

[7/13] from: lmecir:mbox:vol:cz at: 30-Nov-2001 14:12

Hi Romano,

<<Romano>>
But there is at least one little problem with Ladislav's not-rule:

>> nr: not-rule [1]

== [[[1] (finish: [end skip]) | (finish: [])] finish]

>> parse [1] nr

== false

>> parse [2] nr

== false

<</Romano>>

This is not a problem with my rule; it is by design. If you want to match a block that doesn't match the rule [integer!] at the start, you have to use it as follows:

    rule: [integer!]
    nr: not-rule rule
    parse [1 a] [nr to end]
    parse [a 1] [nr to end]

If you want to match a block that doesn't match the rule anywhere, you have to write it as follows:

    parse [a b c 1 d] [any [nr skip]]
    parse [a b c d e] [any [nr skip]]

HTH
Ladislav
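The source of not-rule is not shown in this thread. The helper below is only a guess at how a rule of the shape printed in the console session above (`[[[1] (finish: [end skip]) | (finish: [])] finish]`) might be generated. It is a hypothetical reconstruction, not the code from parseen.r, and the placeholder word RULE-GOES-HERE is made up. It assumes REBOL 2 parse semantics.

```rebol
; Hypothetical reconstruction, NOT the code from parseen.r.
not-rule: func [
    "Build a rule that succeeds, consuming nothing, only where RULE fails"
    rule [block!]
    /local result
][
    ; Template with the same shape as the console output in the thread.
    result: copy/deep [[rule-goes-here (finish: [end skip]) | (finish: [])] finish]
    ; Put RULE in place of the placeholder word in the inner block.
    change/only first result rule
    result
]

nr: not-rule [integer!]

; "Does not start with an integer"
print parse [1 a] [nr to end]              ; false
print parse [a 1] [nr to end]              ; true

; "Contains no integer anywhere"
print parse [a b c 1 d] [any [nr skip]]    ; false
print parse [a b c d e] [any [nr skip]]    ; true
```

When RULE matches, FINISH is set to `[end skip]`, which can never succeed, so the generated rule fails. When RULE does not match, FINISH is set to the empty rule and the generated rule succeeds without advancing the input. Note that FINISH is a global word in this sketch, so nested uses would interfere with each other.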

[8/13] from: lmecir:mbox:vol:cz at: 30-Nov-2001 14:28

Hi once again,

I am adding a variation on Romano's example to show how it can be solved:

    nr: not-rule [1 1 1]
    parse [1] [any [nr skip]]
    parse [2] [any [nr skip]]

HTH
Ladislav

[9/13] from: peter:carlsson:space:se at: 30-Nov-2001 8:33

Hello!

To all who helped me out with the parsing problem: I finally found a solution by myself.

Thanks a lot anyway!

Best regards,
Peter Carlsson

[10/13] from: mario::cassani::icl::com at: 30-Nov-2001 14:16

Hallo Peter,

> To all who helped me out with the parsing problem I
> finally found a solution by myself.
>
> Thanks a lot anyway!

Can you please share it with us if it's different?

Zaijian
Mario

[11/13] from: lmecir:mbox:vol:cz at: 30-Nov-2001 15:42

Hi all,

I just uploaded a newer version of %parseen.r to my Rebsite (Sites/Ladislav). It contains more examples. I left out the unnecessary A-B rule and returned to the iterative version of To-rule (the recursive version was limited by the size of the input).

Cheers
Ladislav

[12/13] from: rotenca:telvia:it at: 1-Dec-2001 17:01

Hi Ladislav,

>> But there is at least one little problem with Ladislav's not-rule:
> This is not a problem with my rule, it is by design.

I did not say it was a bug, only that it must be adapted to the goal you want to reach. And your examples confirm my idea. :-)

--- Ciao Romano

[13/13] from: peter:carlsson:space:se at: 3-Dec-2001 7:31

At 16:00 2001-11-30 +0100, you wrote:

>Hallo Peter,
>
> > To all who helped me out with the parsing problem I
> > finally found a solution by myself.
> >
> > Thanks a lot anyway!
>
> can you please share it with us if it's different?

Well, I used another way to parse, where I included some more rules. That's all. But I will have a look at the suggestions and see if I could use these instead.

Best regards,
Peter Carlsson