r3wp [groups: 83 posts: 189283]
World: r3wp

[I'm new] Ask any question, and a helpful person will try to answer.

mhinson
20-Jun-2009
[3046]
Looks like the instructions are written for someone who already knows 
how to use it...
- run the script  ## this doesn't seem to do anything

- Run the parse-analysis.r script and use the tokenise-parse function 
to get the base data.  ## don't understand what this means, tried 
a few things but they all give errors.

The example works, but I can't even see the parse expressions in it 
so I don't understand why it works or how to adapt it for my own example.

When I first looked at this in April I got quite frustrated because 
it looked as if it was there to help newbies learn about parse, but 
it was too hard for newbies to understand how to use... now I can 
at least understand how to run the example.  Thanks
Sunanda
20-Jun-2009
[3047]
I sympathise....the documented examples for parse-analysis are certainly 
less than clear on what steps you need to take to prep a parse for 
analysis.


If you have worked it out for some simple examples, then adding a 
discussion thread to the script may help others in future.
mhinson
20-Jun-2009
[3048]
I will work it out, I am determined.  Thanks for your encouragement.
Gregg
21-Jun-2009
[3049x3]
I often have trouble visualizing how things work, and I don't feel 
that I really understand something until I can do that.  With PARSE, 
even though it can be tedious and create volumes of output, it may 
help to write some simple grammars and have it print output for every 
rule. Have a probe at the start of the rule, and a probe if it's 
successful; nesting the output helps a lot too.


Don't get discouraged, it's not an easy thing to grok for a lot of 
people.
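
A minimal sketch of the "print on entry, print on success" idea, assuming REBOL 2. The charsets and the rule names (number-rule, word-rule, item-rule) are hypothetical and exist only for illustration; the indentation in the printed text stands in for nesting.

```rebol
REBOL []

; Hypothetical charsets and rules, for illustration only.
digit:  charset "0123456789"
letter: charset [#"a" - #"z"]

number-rule: [
    (print "  number-rule IN")      ; printed whenever the rule is tried
    some digit
    (print "  number-rule OK")      ; printed only if "some digit" matched
]

word-rule: [
    (print "  word-rule IN")
    some letter
    (print "  word-rule OK")
]

item-rule: [
    (print "item-rule IN")
    [number-rule | word-rule]
    (print "item-rule OK")
]

print parse/all "abc" item-rule
```

A rule that prints IN but never OK is one that was tried and failed, which is often the quickest pointer to where a grammar and its input disagree.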
In my largest grammar, where incoming data may be malformed, I've 
found it invaluable to have the rule tracing built in, enabled by 
a flag. e.g. 

    TSAFE-CHAR: [
        (rule-trace "TSAFE-CHAR IN")
        copy =TSAFE-CHAR charset-21
        | charset-22
        | charset-23
        | charset-24
        | charset-25
        | NON-US-ASCII
        (rule-trace "TSAFE-CHAR OUT")
    ]

    rule-trace: func [value /local rule-name action val] [
        rule-name: first parse value none

        ;print [tab rule-name tab found? find don't-trace rule-name]
        action: second parse value none
        if all [
            any [
                trace-rules? = true
                action = form trace-rules?
            ]
            not found? find don't-trace rule-name
        ][
            val: attempt [mold get to word! join "=" rule-name]
            print ["===" value  any [val ""]]
        ]
    ]


Don't-trace allows you to turn off selected rules that may get called 
a lot. You could also set tracing levels if you wanted.
This makes it easy to test input that doesn't parse and, if I can't 
find the problem or location quickly, turn on the trace and see exactly 
where it fails. 


I've also thought about, and asked Carl for, a way to get the position 
where the parse failed, including a line number.
mhinson
21-Jun-2009
[3052]
Thanks for the extra comments. I was working all last night & with 
my daughter all day, so I haven't found enough neurons to study this yet. 
Thanks.
Izkata
21-Jun-2009
[3053]
The main thing I do is, at some point that happens a lot in the code, 
display the data.  Makes it easier to step through the code and do 
- in your head - what the code is doing.  If it suddenly doesn't 
match up, the logic somewhere was wrong.  For example, when working 
on the last one, I had to debug part of it and did this:

parse/all {aX--baX~~a--aX~~aXX~~} [
   some [
      ["a" S: (? S) some [E: ["--" | "~~"] (print copy/part S E) break | skip] | skip]
   ]
]
mhinson
23-Jun-2009
[3054x3]
I seem to be going backwards with learning this. Perhaps I think 
I know the grammar, but don't.

I am trying to write out longhand what I want to do, then convert 
it to a parse, but I don't have any words to describe how parse skips 
back to a previous point, so I can't write what I want to do longhand 
either..  e.g. gather the x in pairs from {fd doixx s x x x oie    x }

test for "x" or skip  then when x is found do the same thing but 
escape to the outside loop.   

If I could write the above in a format that made more sense I think 
I would have a better chance of then converting it to a parse.

test for "x" or skip seems to be   ["x" | skip]

doing it twice to get the x in pairs would seem like  ["x" | skip] 
["x" | skip] 

but that doesn't work because it lacks the loop control, so I add that 
by a bit of guesswork because I don't understand it properly.
parse/all data [some[["x" | skip] ["x" | skip]]]

but this is just completely wrong & I think the reason it is wrong 
is because I have completely misunderstood some or skip or | or where 
in the string the parse pointer gets to after each step.... 
 I have tried using OPT & break in the second section but just get 
more muddled.
Gregg's suggestion sounds as if it might be helpful, but I can't 
understand it yet.  Izkata's suggestion is very helpful, but in 
my case tends to just show that I am miles away from anything like 
a solution.  Thanks.
Often there are ways to work round these needs to be more efficient 
with Parse (like editing the data to be parsed before you start), 
but it would be nice to understand how the loops work enough to use 
them & start developing more resilient parse rules.
PeterWood
23-Jun-2009
[3057x15]
Mike, my first advice would be to avoid trying to "go back" during 
a parse operation at this stage. I think that is something to leave 
until you feel more comfortable with parse.
If possible, it would probably be better to forget the concept of loops 
when thinking about parse too.
Perhaps a good starting point is to think that when you "parse" something, 
you are asking whether it conforms to the rules you have supplied. If it 
does, parse returns true; if it doesn't, parse returns false.

>> parse "abcdefghi" ["abcdefghi"]

== true

>> parse "abcdefghi" ["abcde"]    

== false
You're already familiar with to, thru, end and to end: 

>> parse "abcdefghi" ["a" to "i"]
 
== false
>> parse "abcdefghi" ["a" thru "i"]
 
== true
>> parse "abcdefghi" ["abcde" to end]

== true
Perhaps the second thing to realise is that a rule can be split into "sub-rules":

>> parse "abcdefghi" ["abcde" "fghi"]

== true
Each "sub-rule" can have some Rebol code executed if it is "triggered":


>> parse "abcdefghi" ["abcde" (print "abcde found") "fghi" (print "fghi found")]

abcde found

fghi found

== true
>> parse "abcdefghi" ["abcde" (print "abcde found") "xyz" (print "xyz found")]
 
abcde found

== false
(What Gregg referred to was a much more sophisticated way of getting 
the parse to show you what happened).
Sub-rules can be made alternatives to one another by using | (or) and 
enclosing the options in a block:


>> parse "abcdefghi" ["abcde" (print "abcde found") ["fghi" (print "fghi found") | "xyz" (print "xyz found")]]
abcde found
fghi found
== true
Sorry about the formatting, It's one of the big problems with AltME 
under Mac OS X.
Skip tells parse to move to the next item:

>> parse "abcdefghi" ["abcdefgh" skip]
== true   ;; because the skip took us to the end
We can specify that a sub-rule should be repeated:


>> parse "abcdefghi" ["abcde" 4 skip] 

== true
A better example of repetition:

>> parse "aaa" [4 "a"]

== false
Some and Any are forms of repetition, this shows the difference:

>> parse "aaa1000" [some "a" to end]

== true

>> parse "aaa1000" [any "a" to end] 

== true

>> parse "bbb1000" [any "a" to end]   

== true

>> parse "bbb1000" [some "a" to end]

== false
I apologise if you are already happy with these basic concepts, in 
which case I hope you don't mind the refresher.
sqlab
23-Jun-2009
[3072x3]
Maybe these are some variations of what you are looking for


parse/all "fd doixx s x x x oie    x } " [some [copy d "x" (print d) | skip]]

parse/all "fd doixx s x x x oie    x } " [some [copy d 1 2 "x" (print d) | skip]]

parse/all "fd doixx s x x x oie    x } " [some [copy d 2 "x" (print d) | skip]]

parse/all "fd doixx s x x x oie    x } " [some [copy d "xx" (print d) | skip]]

parse/all "fd doixx s x x x oie    x } " [some [[copy d "x" copy e "x" (print [e d])] | skip]]

parse/all "fd doixx s x x x oie    x } " [some [(g: copy "") 2 [copy d "x" (append g d)] (print g) | skip]]
or you are looking for the pairs

parse/all "fd doixx s x x x oie    x } " [some [[(g: copy "") 2 [copy d "x" (append g d) any notx | skip] (if not empty? g [print g])]]]
I forgot notx

notx: complement charset "x"

parse/all "fd doixx s x x x oie    x } " [some [(g: copy "") 2 [copy d "x" (append g d) any notx | skip] (if not empty? g [print g])]]
mhinson
23-Jun-2009
[3075x2]
Thanks PeterWood. I like to think I am ok with the most basic concepts, 
so now I am trying to learn things that will help me solve my real-life 
problems in a better way.  I use parse pretty much every day 
& always have a rebol console up on my work PC, but ANY SOME & OPT 
& |  I do not understand in context.  I understand them in abstract 
terms, but not how to apply them in conjunction with [] . I do understand 
your examples of some & any (these examples are useful to me). Skipping 
an unknown number of chars to get to the next match is the bit I 
find hard to understand how to construct, particularly if it needs 
to be done in the context of a previous match.
sqlab, I don't know about this syntax at all. I don't think I understand 
what is happening here.

copy to "x"  & copy thru "x"  I understand, but copy "x"  I didn't 
expect to see.
BrianH
23-Jun-2009
[3077]
In the parens you use the COPY function, not the PARSE copy operation. 
Is that what you meant?
mhinson
23-Jun-2009
[3078x3]
The complement syntax & the    to 1 3 digit   where digit is a charset 
seems to be "unreliable" as far as I can understand.
This is what I don't expect.

parse/all "fd doixx s x x x oie    x } " [some [copy d "x" (print d) | skip]]
I don't think I have ever seen the PARSE copy operation documented. 
  I will have a hunt for it.
Maxim
23-Jun-2009
[3081]
have you ever read the parse documentation in the old RT published 
rebol 2.3 pdf  ?  it's a good reference... there are only minor changes 
from that version up to the latest... I don't think any of the examples 
would fail in the current parse.
mhinson
23-Jun-2009
[3082]
Chapter 15 of the Rebol Core Manual  http://www.rebol.com/docs/core23/rebolcore-15.html
 may have a use of this syntax in a complicated example, but no description 
of what is happening exactly.
Maxim
23-Jun-2009
[3083]
yep, that's the online version of it.
BrianH
23-Jun-2009
[3084]
OK, here's what happens: The next recognized pattern is COPY/part'ed 
and assigned to the variable. If the length of the matched pattern 
is 0, #[none] is assigned to the variable.
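
A small sketch of that behaviour, assuming REBOL 2 (the zero-length case is what the post above describes). The variable name d and the input strings are only illustrative.

```rebol
REBOL []

; Illustrative only: d is an arbitrary variable name.
d: "unset"
parse/all "x" [copy d "x"]
probe d    ; the single matched character, no TO or THRU needed

d: "unset"
parse/all "xxy" [copy d some "x" to end]
probe d    ; everything "some" matched, i.e. both x characters

d: "unset"
parse/all "y" [copy d any "x" to end]
probe d    ; "any" matched nothing, so d is none under REBOL 2
```

So copy followed by a plain pattern copies exactly what that pattern consumed, and parse moves past it in the same step.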
mhinson
23-Jun-2009
[3085x2]
Yes, I have read it a lot, but it seems more of a reference for people 
who already know, rather than an explanation of Parse operations.
Thanks BrianH, I was sort of guessing it must be like a variation 
of copy thru "x" that does not skip like thru...  I think I get that 
now. Thanks.
BrianH
23-Jun-2009
[3087x2]
Note that the assignment to the variable happens *after* the pattern 
is recognized, so any code inside the pattern that references the 
value of the variable will get the old value. Like this:
>> x: "old"
== "old"
>> parse "new" [copy x ["new" (print x)] (print x)]
old
new
== true
The same goes for the set operation of block parsing.
mhinson
23-Jun-2009
[3089]
That is pretty important! I had not realised that before & this could 
account for some of the unpredictable behaviour I get.. I thought 
the pattern was complete at your first print statement.   These [] 
have lots of subtle influence.
BrianH
23-Jun-2009
[3090]
[ and ] are a grouping construct.
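
A hedged sketch of what grouping changes, using made-up input strings: without inner brackets, | splits the whole enclosing block into alternatives; with them, the choice applies only inside the brackets.

```rebol
REBOL []

; Without inner brackets the rule reads: ("a" then "b")  OR  "c"
print parse/all "ac" ["a" "b" | "c"]      ; false
print parse/all "c"  ["a" "b" | "c"]      ; true

; With inner brackets the rule reads: "a" then ("b" OR "c")
print parse/all "ac" ["a" ["b" | "c"]]    ; true
print parse/all "c"  ["a" ["b" | "c"]]    ; false
```

When an alternative fails partway through, parse goes back to where the enclosing block started before trying the next alternative, so the position of the brackets decides how far back that is.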
mhinson
23-Jun-2009
[3091]
This is my nemesis. I can't understand how this prints XXXX then 
XX , not XX  three times.  It seems to have a will of its own.

parse/all { X X  XX X X} [some[[copy x "X" (prin x) [copy y "X" (print y) | skip] | skip]]]
I have been stuck on this (in various forms) for over a week now
BrianH
23-Jun-2009
[3092]
Well first of all, you have an extra [ ] in there, just after the 
some.
mhinson
23-Jun-2009
[3093x2]
My thinking is that I expect the inner copy to be executed after 
the first "X" is found, then come back out of the inner bit when 
the next "X" is found.
oops.. my bad with the extra [ ]    I keep trying all sorts & that 
got left behind.
BrianH
23-Jun-2009
[3095]
parse/all { X X  XX X X} [some [copy x "X" (prin x) [copy y "X" (print y) | skip] | skip]]

Character at a time:
- the outer skip
- copy x "X" (prin x)
- the inner skip
- copy x "X" (prin x)
- the inner skip
- the outer skip
- copy x "X" (prin x)
- copy y "X" (print y)
- the outer skip
- copy x "X" (prin x)
- the inner skip
- copy x "X" (prin x)
- the outer skip

Try this:

>> parse/all { X X  XX X X} [some [copy x "X" (prin x) [copy y "X" (print y) | skip (prin "i")] | skip (prin "o")]]
oXiXioXX
oXiXo== true