World: r3wp
[I'm new] Ask any question, and a helpful person will try to answer.
Maxim 3-May-2009 [2074x2] | the problem is that the above can be extremely slow... just like regexp :-) |
be back later ... don't forget the assignment ;-) | |
mhinson 3-May-2009 [2076x3] | Thanks again Maxim |
Sunanda, I had a look at the parse visualiser yesterday, it looks a bit advanced for me yet. A simple version would be good for newbies. I got it to work on his examples, but my examples produced no output. I expect I was doing something foolish. I will return to it when my basic skills are a bit better. | |
Maxim. Here is my first attempt at the homework you set me. This builds on what you showed me & also relates to the example given to me by Pekr.

    data: "before first tag <TAG> after 1st pointy tag [TAG] after square tag <TAG> after pointy tag 2"
    tag-square: "[TAG]"
    tag-pointy: "<TAG>"
    output: func [tag here] [
        print rejoin ["we are passed the " tag " : '" here "'"]
    ]
    parse/all data [
        some [
            [tag-pointy here: (output tag-pointy here)]
            | [tag-square here: (output tag-square here)]
            | skip
        ]
    ]

I thought it would make the action clearer if the output was in a function & the keys used variables. | |
Ladislav 3-May-2009 [2079x2] | mhinson: your rule: b: [to "bb" break] looks quite dangerous. TO means a lot of input may be skipped, which is usually not what you want. Moreover, BREAK is not in the right place in that rule (it just breaks the rule, but that is totally unnecessary). |
(I am not a big fan of BREAK myself, every rule can be written without BREAK) | |
mhinson 3-May-2009 [2081] | Ladislav, thanks for your comments. It has been suggested that I should avoid TO in any backtracking parse until I know much more about what I am doing. |
Ladislav 3-May-2009 [2082x4] | yes, in your rules you certainly wanted to look for more alternatives than just "bb". In that case the usage of TO is not advisable. |
(you have to tell Parse what the alternatives are *before* using any cycle construct like TO, SOME, etc.) | |
so, my guess is that you wanted something like

    b: "bb"
    y: "yy"
    parse input [any [b | y | skip]]

or some such; the above would find all occurrences of the parts you specified | |
if you want to find just the first occurrence, then you may use e.g.:

    occurrence: [b | y]
    parse input [any [occurrence break | skip]...] | |
Pekr 3-May-2009 [2086] | mhinson: there is a simple rule for how to read TO: skip everything until you find the target. This is not what you wanted in your original example, because e.g. TO "b" might also mean that "b" is the last char of your string. So if you are looking for the FIRST occurrence of [a | b | c], you really have to forget TO and use skip-based parsing, one char at a time. Hence some [a break | b break | c break | skip] is your friend ... |
Ladislav 3-May-2009 [2087] | To [a | b | c] may work in the future, but it certainly does not work now (although it is not hard to replace using ANY or SOME) |
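A minimal sketch of the skip-based search for the first occurrence of several alternatives, assuming REBOL 2 parse behaviour. The sample string and the words `text` and `found` are made up for illustration; `b` and `y` follow Ladislav's example.

```rebol
REBOL []

text: "xx yy zz bb"       ; hypothetical sample input
b: "bb"
y: "yy"
found: none

; At each position try every alternative first. Only when none of
; them matches does SKIP advance one character, so nothing is
; jumped over the way TO "bb" would jump over an earlier "yy".
parse/all text [
    any [
        here: [b | y] (found: here) break
        | skip
    ]
]

if found [print ["first occurrence starts at index" index? found]]
```

The position is recorded before the alternatives are tried, so `found` points at the start of whichever key matched first.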
mhinson 3-May-2009 [2088] | I am trying to formulate an example that shows why I thought TO was useful. It mostly has to do with where I want to extract the data complete with the key I used to find it. Without using TO it seems that I need to add the string I was looking for back onto the data I have extracted. |
Ladislav 3-May-2009 [2089x2] | did you have a look at the link Sunanda mentioned? http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse? |
aha, you want something like the AT command. That is easy. [here: "string" :here] | |
mhinson 3-May-2009 [2091x3] | Yes, that was the first time I found that, as I had not realised the Wiki was so extensive. It is a good source. |
for this example, what would be the simplest version that returns exactly "B2." without using TO?

    parse "A1.B2.C3." [to "B" copy result thru "."]
    print result | |
assuming the only reference points are "B" and "." | |
Ladislav 3-May-2009 [2094x2] |

    b: [here: "B" :here copy result thru "."]
    parse input [any [b break | skip]] |
you may need some other tricks, like how to make the rule fail, if no occurrence was found | |
mhinson 3-May-2009 [2096] | I could not reply. AltME was broken. I don't understand the syntax of the :here statement. |
Pekr 3-May-2009 [2097x3] | it's easy - there are so called parse markers. Imagine parse working on strings (blocks) at parse level. But then there is underlying level, the string (block) itself. You can access the string from the parse level either from parens, or by setting markers. |
so what the code from Ladislav does is:

    here:   ; mark the position in the input string. Something like AT. Then you can use it in your parens
    "B"     ; it matches "B"
    :here   ; return to the saved position

So the parse position moves back to just in front of the "B" | |
You can also think about it like: start: some-matching rules end: (copy/part start end) | |
mhinson 3-May-2009 [2100] | So is :here part of the parse dialect? I understand here: but not :here |
Pekr 3-May-2009 [2101] | yes |
mhinson 3-May-2009 [2102] | Should I understand it to move the parser "cursor" back to the position of index? here ? |
Pekr 3-May-2009 [2103x3] | Look at my explanation above, and try to understand it. After here:, there is going to be "B" matched. So it means, that index is moved past "B". But you want to have your string copied including "B". So by issuing :here, you put parser to the saved position. |
exactly ... | |
... you can have many such named markers ... | |
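A small sketch of the two marker idioms discussed above, assuming REBOL 2 parse behaviour. The words `text`, `result`, `result2`, `start` and `finish` are illustrative names; the rule `b` is Ladislav's.

```rebol
REBOL []

text: "A1.B2.C3."
result: none
result2: none

; Idiom 1: mark, match, then jump back so COPY includes the key.
;   here:  saves the current input position in the word HERE
;   "B"    matches and moves the position past the "B"
;   :here  moves the position back to the saved one
b: [here: "B" :here copy result thru "."]
parse/all text [any [b break | skip]]

; Idiom 2: mark both ends and copy between them inside a paren.
parse/all text [
    any [
        start: "B" thru "." finish: (result2: copy/part start finish) break
        | skip
    ]
]

print [result result2]
```

Both variants are meant to leave "B2." in their result word, which is the goal stated earlier in the thread. Neither call to parse consumes the whole input, so the value parse itself returns is not a useful success test here.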
mhinson 3-May-2009 [2106x3] | I can appreciate the usefulness of that. |
I am not sure why the BREAK is needed in the example from Ladislav above. Is it to force the rule to return true when the "B" and "." matches are found to prevent it carrying on looking for a second match further down the string? | |
Testing that idea out looks as if I have stumbled on the right answer. Maybe there is hope for me yet. | |
Pekr 4-May-2009 [2109] | 'break is needed in Ladislav's code imo because after the first match of "B" you want to escape (break from) the repetitive 'any block and continue your processing with further rules (which is not the case with Ladislav's example, but is the case with your example, where 'copy followed). Without the 'break, after matching "B" the rule would still succeed, because where there is no "B" there is always the 'skip option, which is valid until the end of the input. So without the 'break, this 'any block would 'skip until the end of the input string is reached ... |
Ladislav 4-May-2009 [2110] | re Break: it is used to make sure the "B"..."." is processed just once. If you need to process many such parts, then don't use Break |
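To make the effect of BREAK concrete, here is a hedged sketch that runs the same rule with and without it, assuming REBOL 2 parse behaviour. The input string and the words `text`, `results` and `part` are hypothetical.

```rebol
REBOL []

text: "A1.B2.C3.B4."      ; hypothetical input with two "B" parts
results: copy []
part: none

b: [here: "B" :here copy part thru "." (append results part)]

; With BREAK the ANY loop stops after the first "B"..."." part.
parse/all text [any [b break | skip]]
probe results

; Without BREAK the loop carries on to the end of the input
; and collects every "B"..."." part it meets.
clear results
parse/all text [any [b | skip]]
probe results
```

The first probe should show a single part and the second should show both, which is the difference Ladislav describes.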
mhinson 4-May-2009 [2111] | I have been working out ways to extract IP addresses from a string today. Is this a good way to do it? What could catch me out?

    parse to-block "junk 111.111.111.111 0.0.0.0 255.255.255.128 junk" [
        any [
            set tup tuple! (print tup)
            | skip
        ]
    ] |
Oldes 4-May-2009 [2112] | It depends, what the junk can be.. in your case it must be REBOL loadable. |
mhinson 4-May-2009 [2113] | I was hoping the TO-BLOCK would take care of that. do I need to parse the junk first to remove unloadable strings? or is there another TO- function that will do it for me please? |
Oldes 4-May-2009 [2114] | This should be safe:

    use [ ch_numbers ch_rest rl_ip ip-start ip-end ips ][
        ch_numbers: charset "0123456789"
        ch_rest: complement ch_numbers
        ips: copy []
        rl_ip: [
            ip-start:
            some ch_numbers #"."
            some ch_numbers #"."
            some ch_numbers #"."
            some ch_numbers
            ip-end:
            (error? try [append ips to-tuple copy/part ip-start ip-end])
        ]
        set 'get-ips func [str][
            clear ips
            parse/all str [ some [ any ch_rest rl_ip ] ]
            ips
        ]
    ]
    get-ips "err,.;s 111.111.111.111 0.0.0.0 255.255.255.128 junk" |
mhinson 4-May-2009 [2115] | Thanks for that snippet, sounds as if you have been on this trail before. Thanks. |
Oldes 4-May-2009 [2116] | sorry.. the rule should be:

    some [ any ch_rest rl_ip | skip ]

so it handles cases like:

    get-ips "err,.;s 111.111.111 0.0.0.0 255.255.255.128 junk" |
mhinson 4-May-2009 [2117] | clever, defo better than my simple tuple search. Thanks. |
Oldes 4-May-2009 [2118x4] | or you can use:

    get-ips2: func [str /local ips ip][
        str: parse str none
        ips: copy []
        while [not tail? str] [
            error? try [
                ip: to-tuple str/1
                if 4 = length? ip [append ips ip]
            ]
            str: next str
        ]
        ips
    ] |
BUT in the second version there is a problem with cases like: get-ips2 {"this is invisible IP 0.0.0.0" 255.0.0.0} | |
Also it fails with: get-ips2 {this is NOT an IP 255.0.0.} | |
So the result is.. if you want to be sure, use the string based parsing. | |
mhinson 4-May-2009 [2122] | I suppose it depends where the data comes from. Looking at configs from routers should mean the IP addresses & masks etc are already properly formatted. Thanks. |
Pekr 5-May-2009 [2123] | you could also probably enhance it by stating 1 to 3 digits, dot ... |
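One way Pekr's suggestion might look, as a sketch only and assuming REBOL 2 parse behaviour. The word `octet` is an illustrative name; `ch_numbers`, `ip-start`, `ip-end` and `ips` follow Oldes's snippet, here without the surrounding USE context.

```rebol
REBOL []

ch_numbers: charset "0123456789"
octet: [1 3 ch_numbers]           ; between one and three digits

rl_ip: [
    ip-start:
    octet #"." octet #"." octet #"." octet
    ip-end:
    (error? try [append ips to-tuple copy/part ip-start ip-end])
]

ips: copy []
parse/all "junk 111.111.111.111 0.0.0.0 255.255.255.128 junk" [
    any [rl_ip | skip]
]
probe ips
```

Limiting each part to three digits does not by itself reject longer digit runs: with skip-based scanning, an input such as "1234.1.1.1" would probably still match from its second character, so a check on the characters just before `ip-start` and just after `ip-end` would be needed for strict matching.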