World: r3wp
[I'm new] Ask any question, and a helpful person will try to answer.
[unknown: 5] 8-Jan-2009 [1666] | Wanted to say Welcome to Janko and BenBran. |
Janko 8-Jan-2009 [1667] | Thanks Paul |
[unknown: 5] 8-Jan-2009 [1668] | You're welcome. |
Janko 8-Jan-2009 [1669x5] | I have another question about parse, if I may. I am trying to make a parse block that will uppercase every letter that follows a . ! or ? character. I did it just for dots, but I can't make it work for all three (one alternative is to call parse three times, each time with a different separator char). The problem can be observed below. It happens because an [ A | B | C ] pattern first looks for A, and only if it doesn't find A does it check B, which means it will skip B if A comes after B. Is there any way to say "use any of those chars, whichever comes first"? Here is an example where you can see the problem: |
== true
>> parse "A.B!C.D." [ any [ [thru "." | thru "!" ] mark: (print mark ) ] ]
B!C.D.
D.
>> parse "A.B!C.D." [ any [ [thru "!" | thru "." ] mark: (print mark ) ] ]
C.D.
D.
--- In the first case it skips the C, in the second it skips the B. | |
This is my code to uppercase after sentences: parse X [ ANY [ [ thru "." | thru "!" | thru "?" ] mark: ( uppercase/part trim mark 1 insert mark " " ) :mark ] ] | |
It works if I have just one kind of separator, or if I have them in this order, for example "a.b.c.d!e?". If I have "a.b.c.d!e?f. " it will skip the ! and ? and produce "a. B. C. D!e?f. " | |
because it will skip over ! and ? to the last "." | |
Steeve 8-Jan-2009 [1674] | hmm... do this parse source [any [["!" | "?" | "."] mark: (do something) | skip ]] |
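Steeve's suggestion can be filled in as a single pass. The following is only a minimal sketch for REBOL 2: the input string and the word text are made up for illustration, and the action in the paren is one possible reading of "do something".

```rebol
REBOL []

; Hypothetical sample input, not taken from the thread.
text: copy "a.b.c.d!e?f"

; Advance one character at a time. At each position, either match a
; terminator and act on what follows it, or skip a single character.
; Because every terminator is tried at every position, the earliest one wins.
parse/all text [
    any [
        ["." | "!" | "?"] mark: (uppercase/part mark 1)
        | skip
    ]
]

print text
```

Since uppercase does not change the length of the string, the parse position does not need to be reset here.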
BenBran 9-Jan-2009 [1675x2] | Thanks Reichart. Yes, a good old Irish name (spelling changed when it got to America). If I have my numbers right, I'm 4th generation German/Irish/Swedish American, and my daughter can add Spain and Mexico to her list. R2 did keep my interest these years. I'm somewhat at a loss to |
oops typo sorry.... | |
Henrik 9-Jan-2009 [1677] | BenBran, you can click the pencil above the text edit field to make multiline messages. |
BenBran 9-Jan-2009 [1678] | Thanks, that's much better. |
Janko 10-Jan-2009 [1679] | Steeve: I solved it by doing 3 passes, one for each character (.!?). Performance is not that important here as it's a client, but if it's possible to do it in one pass I would certainly like to learn about it. I will try what you proposed. Thanks! |
Oldes 10-Jan-2009 [1680x2] | str: "a.b.c.d!e?f. "
chars: complement charset ".!?"
>> parse str [any chars tmp: to end (uppercase tmp)] str
== "a.B.C.D!E?F. " |
>> parse str: "assd.asd!d" [any chars tmp: (uppercase tmp)] str
== "assd.ASD!D" | |
mhinson 13-Apr-2009 [1682x2] | Hi. I am struggling to understand parsing & hoping for some pointers. I have read everything I can find but still can't seem to use parsing for basic extraction of information from a number of lines (or even a single line). This is what I am trying to do & I would love some help or links to documentation I may have missed, please.
lines: {junk wanted line1 contentA rubbish junk notNeeded line2 wanted line three content B rubbish }
;I want to extract
;wanted line1 contentA
;wanted line three content B
;That is to say everything between "wanted" up to "rubbish" but including "wanted"
Thanks, /\/\ |
Another (maybe foolish) question please. I am trying to use this script to help me understand the use of parsing to extract data from files. If I paste the script into my REBOL/View console it pastes in the script ok, but the examples do not work. This seems very common with a lot of the scripts in this library and is a problem I have been fighting with for several days. This is what I get.
>> ini: parse-ini-file %/c/windows/win.ini
** Script Error: Out of range or past end
** Where: parse-ini-file
** Near: append last current-section parsed-line/1 append
>>
Am I pasting the script & examples to the wrong type of console or something? I feel it must be something I am doing, as so few of the example scripts work for me. Thanks, /\/\ | |
Graham 13-Apr-2009 [1684x2] | You need to provide some rules for what you want and what you consider rubbish. |
there has to be a pattern that you recognize to determine what is what. | |
PeterWood 13-Apr-2009 [1686x2] | >> extract: copy []
== []
>> parse lines [any [["wanted" copy temp to "rubbish" (append extract temp)] | skip ]]
== true
>> extract
== [" line1 contentA " " line three content B "] |
Have you read this - http://www.codeconscious.com/rebol/parse-tutorial.html | |
mhinson 14-Apr-2009 [1688] | Hi, thanks very much for the fast replies. I have read the parse-tutorial and it seems very good for understanding how to create rules that will match patterns. However, I only found one brief section that described using "copy" to extract the data from the line, rather than just confirming that a match was found (or not). I tried to use the copy examples, but every time I modified them I ended up with errors, as I don't really understand how they work. Peter, thanks for your example. It does almost what I want, but the result in 'extract' does not contain the part of the string matched by "wanted". In my simple example I could just append the word "wanted", but in a real world case I would be using a pattern match to find the "wanted" key word. I also want to develop the code further to search for a different set of matches if the first set is found. In your example I am unclear where the block is that is performed if the string is found. Thanks very much for your help. /\/\ |
Geomol 14-Apr-2009 [1689] | There's a bit about COPY in PARSE here: http://www.rebol.com/docs/core23/rebolcore-15.html#section-7.3 |
Pekr 14-Apr-2009 [1690] | mhinson - dunno if somebody already replied to you, but 'copy works quite fine. The trouble is, when you change parsed string in paren. You have to put markers there, and return to correct position ... |
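Pekr's remark about markers can be illustrated with a small sketch for REBOL 2. The sample string and the word names are hypothetical. The idea is that when the code in the paren changes the length of the string, the rule records the position with a set-word, adjusts it, and hands it back to parse with a get-word.

```rebol
REBOL []

; Hypothetical sample input, not taken from the thread.
text: copy "a.b!c"

parse/all text [
    any [
        ["." | "!"] mark: (
            insert mark " "      ; the string grows by one character here
            mark: next mark      ; step past the inserted space
        ) :mark                  ; resume parsing from the adjusted position
        | skip
    ]
]

print text
```

Without the :mark step the parser would continue from its old index, which after an insertion no longer points at the character it pointed at before.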
PeterWood 14-Apr-2009 [1691x3] | Mike: A small change will include wanted:
>> extract: copy []
== []
>> parse lines [any [[copy temp ["wanted" to "rubbish"] (append extract temp)] | skip ]]
== true
>> extract
== ["wanted line1 contentA " "wanted line three content B "] |
The code that is executed in a parse rule is enclosed in parentheses (). So the parse rule that finds wanted... is:
copy temp ["wanted" to "rubbish"] (append extract temp)
The copy copies the part of the input that matches, from the start of "wanted" to the start of "rubbish". Then the Rebol code (append extract temp) is executed. (I would normally write the Rebol as - insert tail extract temp - as it is faster than append in Rebol 2.) | |
You can also insert Rebol code at the start of the parse rules to perform initialisation:
parse lines [(extract: copy []) any [[copy temp ["wanted" to "rubbish"] (insert tail extract temp)] | skip ]] | |
sqlab 14-Apr-2009 [1694x2] | Another solution: >> rule: [(wanted: copy [] ) any [to "wanted" copy line to "rubbish" (append wanted line)] to end] |
Better: rule: [(wanted: copy [] ) any [to "wanted" copy line to "rubbish" (append wanted line) skip ] to end] | |
mhinson 14-Apr-2009 [1696x2] | Thanks very much Peter & sqlab. Those examples both do exactly what I was thinking. I now need to try & understand how this relates to the parse-tutorial & hopefully I will be able to start using the principles myself. Thanks again. |
Hi again. Sorry to be asking questions again so soon. I started using the syntax suggested with success, but in my input file I find the first key word is only valid if it is right at the start of the line. I have been searching through the documentation for the last hour & failed to find any references to "start of line" or similar. (like ^ in reg expressions). I wondered if there was any document to help people convert from regular expressions to Rebol parse expressions too please? Thanks, /\/\ | |
Pekr 14-Apr-2009 [1698x3] | Regexp is quite a different beast, and there are no simple rules for translation to REBOL's parse. However - what do you mean by the beginning of the line? Is it the first char right after the end-of-line? |
btw - do you use parse/all? I prefer to use parse with the refinement, because using plain 'parse ignores whitespaces, and I don't like when the engine messes with things instead of me :-) | |
Could you please post a few lines of your input file? | |
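The difference Pekr mentions between plain parse and parse/all can be seen in a short sketch. This assumes REBOL 2 (the /all refinement belongs to R2), and the word names are only illustrative.

```rebol
REBOL []

; Plain parse skips whitespace between matches when given a block rule.
plain: parse "a b" ["a" "b"]

; With /all the space is ordinary input, so this rule no longer matches...
strict: parse/all "a b" ["a" "b"]

; ...unless the rule accounts for the space explicitly.
explicit: parse/all "a b" ["a" " " "b"]

print [plain strict explicit]
```

With /all, newlines are also ordinary input, which is what makes rules such as thru newline dependable for line-oriented data.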
sqlab 14-Apr-2009 [1701] | Try this: rule: [(wanted: copy [] ) any [copy line ["wanted" to "rubbish" ] (append wanted line) | thru newline] ] |
mhinson 14-Apr-2009 [1702] | Hi, Pekr, I appreciate that the concept for parsing is different to the use of regular expressions, but there are some things that do map from one to the other & I wondered if any table of those things existed. As a noob, sometimes the hardest questions to get answered are the ones where the answer is that there is no concept such as that sought by the noob, e.g. how do you grow strawberries in the sea? The first match must be at the beginning of the line. If it was the first line in the set then it would not be after a new line, but in other cases it would be. I will use parse/all from now on, I like the extra control you describe. Here are a few lines of test input. The script I am hoping to develop is to parse the config files from Cisco devices in order to extract the layer 2 & 3 information together with the interface names & descriptions.
lines: {interface FastEthernet0
description The connection to the printer
!
interface FastEthernet1
!
interface Vlan1
description User vlan (only 1 vlan allowed)
no ip address
!
interface Dialer0
description Outside
ip address negotiated
!
interface BVI1
description Inside
ip address 192.168.0.1 255.255.255.0
!
ip sla 3
icmp-echo 217.0.0.1 source-interface Dialer0
ip route 0.0.0.0 0.0.0.0 Dialer0
interface ATM0.1 point-to-point
no ip redirects
no snmp trap link-status
pvc 0/38
pppoe-client dial-pool-number 1
!
}
; sqlab, your change to use "thru newline" does what I wanted in this case, which is good.
; My next step is to try & understand the "or" construct properly, as the code below doesn't quite cut it.
wanted: copy []
interface: ["interface" [to #"^/" | to "point-to-point"]]
parse lines [any [[copy temp interface (insert tail wanted temp)] | thru newline ]]
foreach line wanted [print line]
; thanks very much for your help, /\/\ |
Pekr 14-Apr-2009 [1703x2] | I am far from a parse guru, but the above rule (while it works) looks weird :-) Why produce the interface rule that way? The line ends with a line terminator anyway, no?
parse/all lines [ any [ [ "interface" copy int-name to newline (print int-name) newline | skip ] ] ] |
... this is really simpler, no subrule to ruin your brain is needed ... | |
sqlab 14-Apr-2009 [1705] | I am not sure that I understand your intention. If you want just interface ATM0.1, then you have to switch the order of your interface rule, as the condition to #"^/" (newline) is already true and done, and your cursor is then behind "point-to-point". As the first part is true, the second will never be done. |
Pekr 14-Apr-2009 [1706x2] | should point-to-point be filtered out? Then the rule would be a bit different .. |
Slightly different version:
wanted: copy []
spacer: charset " ^/"
name-char: complement spacer
interface: [ "interface " copy int-name some name-char (append wanted int-name) spacer ]
parse/all lines [any [interface | skip]]
print mold wanted | |
mhinson 14-Apr-2009 [1708] | Yes, point-to-point needs to be ignored from the result, and other similar cases in real life. Once the interface string & details are found, the script will need a sub search that is looking for "description" or "ip address". I was hoping that by extracting the rule used for each search I would make it easier to add new rules as the requirement becomes clear. I tried swapping the order in the rule to interface: ["interface" [to "point-to-point" | to #"^/"]] but this just finds everything in the whole input. Perhaps I am too old to learn this. I worked programming in Pascal a good few years ago, but only for about a year. I failed to grasp SmallTalk more recently & I am really struggling with this. Thanks for all your help. /\/\ |
Pekr 14-Apr-2009 [1709x2] | to [ aaaa | bbbb] is a long-standing parse enhancement request, which is not yet implemented, but is planned for 3.0. It would really make the lives of parse beginners much easier. Your parse rule simply means - try to find "point-to-point" or the end of the line. But - it looks for "point-to-point" until it reaches the end of the input string. |
mhinson - just don't give up ... if you are a beginner with REBOL, you chose to start with a pretty advanced topic. | |
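One way around the missing to [a | b] in REBOL 2 is to copy the whole line first and trim the unwanted suffix in ordinary REBOL code afterwards. This is only a sketch of that idea, not a method anyone in the thread proposed: the shortened sample input and the words temp and pos are assumptions made for illustration.

```rebol
REBOL []

; Shortened, hypothetical sample in the style of mhinson's input.
lines: {interface FastEthernet1
!
interface ATM0.1 point-to-point
no ip redirects
!
}

wanted: copy []

parse/all lines [
    any [
        copy temp ["interface" to newline] (
            ; Cut the copied line at " point-to-point" when that suffix is present.
            if pos: find temp " point-to-point" [temp: copy/part temp pos]
            insert tail wanted temp
        )
        | thru newline
    ]
]

foreach line wanted [print line]
```

Because the fallback alternative is thru newline rather than skip, "interface" is only tried at the start of a line, so a keyword in the middle of a line is never picked up.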
Henrik 14-Apr-2009 [1711] | yes, parsing is one of the most difficult topics of REBOL. |
mhinson 14-Apr-2009 [1712] | Thanks for the encouragement. I won't give up yet for a good while. Most of the programming I have done is out of a need to produce a specific result & that quite often needs to be fairly complex, however having a real need also makes the effort seem more worthwhile. I appreciate that parsing is quite hard, but it also seems to be one of the features that differentiates REBOL from other languages & is often referred to as being more efficient once the concepts are fully grasped. If this is not true, then perhaps I would be better off with php or perl etc. I have also already had some fun with the very straightforward graphical stuff, which is fantastic. I am off out now. I hope to make a bit more code work tomorrow as I am on holiday this week. :-) Thanks again |
Pekr 14-Apr-2009 [1713x3] | You can also use rebol and call php or perl for some stuff :-) However - your rules could be made - you just need to split them into sections and find some rules for the parsed file structure. |
spacer: charset " ^/"
name-char: complement spacer
interface: [ "interface " copy int-text some name-char (print ["interface: " int-text]) (append wanted int-text) thru newline ]
description: [ "description " copy desc-text to newline (print ["description: " desc-text]) newline ]
ip-address: [
    [ "ip address " copy add-text to newline (print ["ip address: " add-text]) newline
    | "no ip address" newline (print ["ip address:" "no adress"]) ]
]
int-section: [interface any [description | ip-address | "!" break | skip]]
parse/all lines [any [int-section | skip]] | |
... ignore (append wanted int-text) above - I did not use it in the code, I just used print to check how sections work ... | |