World: r3wp
[I'm new] Ask any question, and a helpful person will try to answer.
Henrik 17-Apr-2009 [1753x4] | but unlike other languages that have to use regexp as a crutch for parsing, REBOL can use other methods than PARSE to process dialects. PARSE is useful in some cases and in other cases, there are better/simpler approaches for processing your dialect. |
You are in dialect territory immediately when you are defining a block of data and wanting to do something other than evaluating it with DO. | |
Want to create a dialect using only integers? Fine:
>> total: 0
== 0
>> parse [1 2 3 4] [4 [set num integer! (total: total + num)]]
== true
>> total
== 10 | |
But in another place, [1 2 3 4] might mean something entirely different. It also happens to evaluate as normal REBOL code:
>> do [1 2 3 4]
== 4
But of course it doesn't do much. :-) Code is data and data is code. | |
mhinson 17-Apr-2009 [1757] | Perhaps I should go back to trying to form a program specification & see if the advice I get in that context is different. If I have
print "hello world"
that seems to follow the syntax rules shown by "source print". Are you saying that because I could have
>> hi: [print "hello world"]
== [print "hello world"]
>> do hi
hello world
I have started using a dialect? |
Henrik 17-Apr-2009 [1758x2] | No, when using DO, it will not be a dialect, just normal REBOL code. Before you do anything with it, the block is just a chunk of data. A dialect involves some kind of processor, one that you write or that already exists in REBOL, which you then apply to the chunk of data, and which is not the base scanner (the main language parser). |
An example where DO wouldn't work would be VID (the graphics user interface system, Visual Interface Dialect):
do [button "Hello world!"] ; gives an error
layout [button "Hello world!"] ; returns a meaningful result, because the block was parsed as a dialect. | |
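The distinction Henrik draws between DO and a dialect processor can be sketched with a very small home-made dialect. This is only an illustration: the keywords SAY and SHOUT and the function name run-dialect are invented here and are not part of REBOL or VID.

```rebol
; Hypothetical two-keyword dialect: SAY or SHOUT followed by a string.
; SAY and SHOUT are made-up names, not REBOL functions.
run-dialect: func [block [block!] /local text] [
    parse block [
        any [
            'say set text string! (print text)
            | 'shout set text string! (print uppercase copy text)
        ]
    ]
]

commands: [say "hello" shout "world"]

run-dialect commands    ; our own processor gives the block a meaning
; do commands           ; plain evaluation would fail: SAY has no value
```

The same block of data is meaningless to DO but meaningful to run-dialect, which is the role LAYOUT plays for VID blocks.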
mhinson 17-Apr-2009 [1760x2] | Sorry, I am not getting it at all. |
What you seem to be showing me looks like what I would call functions or procedures. I can't understand the distinction yet. | |
sqlab 17-Apr-2009 [1762x2] | To make it short: it is like a new language inside your programming language. It can be just an add-on that enhances your normal language, or a totally different language. |
So parse has its own language, which sometimes resembles REBOL and sometimes is different. And you can make your own languages or dialects too. | |
mhinson 17-Apr-2009 [1764x2] | It sounds like a very flexible concept, but likely to add complexity. Parse seems to be well documented in terms of how a string can be split apart in this manner
>> probe parse {Hello world} none
["Hello" "world"]
but much less documented when trying to do complex stuff... In my ignorance I was expecting it to follow some sort of syntax rules that I could read about. Have I missed a basic concept? |
I am only thinking about finding & extracting data here, not parsing for commas or html tags etc. | |
sqlab 17-Apr-2009 [1766] | http://www.rebol.com/docs/core23/rebolcore-15.html |
Henrik 17-Apr-2009 [1767] | I know the following sounds basic, but it's _crucial_ to understanding how REBOL works, otherwise you will not "get" REBOL:
You must understand the concept that data is code and code is data. That means that anything you write, can be considered pure data. It's what you do with that data that becomes important.
It's like speaking a sentence, but only paying attention to the placement of letters. That makes the sentence pure data. If you then pay attention to the words objectively, they can form a sentence, so you can validate its syntax. If you use the sentence in a context, you can apply meaning to it, subjectively. If you switch the context, the sentence can mean something entirely different. This is very important. Context and meaning.
For REBOL:
[I have a cat]
This is a block with 4 words. It's pure data that can be stored in memory, but at that level it doesn't make any sense to REBOL. If you then apply a function to that data, you can process it. DO processes that data as REBOL code. It will be evaluated as REBOL code. Here it will produce an error, because it's not valid REBOL code. If you produce your own dialect, for example with PARSE, you can make that block make sense.
When typing in the console, REBOL evaluates it as normal REBOL code by using DO internally. That means:
>> now
== 17-Apr-2009/18:13:14+2:00
is the same as:
do [now]
But this block:
[now]
is just pure innocent data with no meaning. |
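Henrik's [I have a cat] block can be poked at directly to see the "data first, meaning later" idea. The following is a rough sketch, assuming a fresh console in which the word I has not been given a value; the word pet and the parse rule are invented for this illustration.

```rebol
sentence: [I have a cat]

print length? sentence          ; how many values the block holds
print type? first sentence      ; each value is a word!, not a string

; Evaluating the block as code fails, because the word I has no value
print error? try [do sentence]

; A parse rule gives the very same block a meaning of its own.
; The rule below is a made-up mini dialect: "I have a <word>".
if parse sentence ['I 'have 'a set pet word!] [
    print ["The pet is a" pet]
]
```

Nothing about the block changes between the DO attempt and the PARSE call; only the processor applied to it does.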
mhinson 17-Apr-2009 [1768] | you are all very kind to spend so much time helping me with this. |
Brock 17-Apr-2009 [1769x3] | Section 7.4, Marking Input, in the link provided by sqlab explains the use of set-words in the parse dialect. I needed to use this technique to strip out large comments from web logs that I was parsing and passing into a database. I was able to remove the large comment and replace it with a default string indicating "comments have been removed". |
One of the methods I used to parse lines of data was as follows:
lines: read/lines %file.txt
foreach line lines [
    parse line [parse rule here]
] | |
This way you are only dealing with a small amount of data for each parse, which might make it easier for you to visualize. | |
Pekr 17-Apr-2009 [1772] | mhinson: now my explanations to some of your questions, as I think not everything was explained to you properly:
1) parse/all - /all refinement means, that string is parsed "as-is", because without the /all, white-space is skipped:
>> parse "this white dog" ["this" "white" "dog"]
== true
>> parse/all "this white dog" ["this" "white" "dog"]
== false
>> parse/all "this white dog" ["this" " " "white" " " "dog"]
== true
I prefer to always use /all refinement for string parsing ...
2) i don't understand why there is | before "some", that code will not work imo ...
3) "ifa:" is a marker. Think about parse in following terms ... you have your data, here a string. Parse is the matching engine, which tries to match your input string according to given rules. In parse context (dialect) you have no means of how to manipulate the input string, except the copy. So markers are usually used, when you want to mark some position, then do something in parens, and then get back the position, or simply mark start: .... then somewhere later end: and in the paren (copy/part start end) to copy the text between the two marked positions ...
4) "skips till one of the OR conditions are met" - very well understood ...
5) Here's slight modification for append/only stuff. Type "help append" in the console. /only appends block value as a block. You will understand that, once you will need such behaviour, so far it can look kind of academic to you :-) I put parens there, to make more obvious, what parameters are consumed by what function ....
>> wanted: copy []
== []
>> append (append wanted (copy/part "12345" 3)) interf: copy ["abc"]
== ["123" "abc"]
>> wanted: copy []
== []
>> append/only (append wanted (copy/part "12345" 3)) interf: copy ["abc"]
== ["123" ["abc"]] |
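Pekr's point 3 about markers can be made concrete with a short sketch. The input string and the names start, finish and name are assumptions chosen for this example (finish is used instead of end, since END is also a parse keyword).

```rebol
text: "name: Router1; port: fa0"

parse/all text [
    thru "name: "
    start:                      ; set-word marks the current input position
    to ";"
    finish:                     ; second marker, sitting just before the ";"
    (name: copy/part start finish)   ; ordinary REBOL code inside the paren
    to end
]

print name
```

Both markers are just positions in the same string, so copy/part start finish copies the text that lies between them.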
mhinson 17-Apr-2009 [1773] | Thanks very much again for so much help, I am very grateful for the time you have spent helping me with this. A bit of a light is beginning to come on.. so outside of the parse dialect we have this syntax
result: copy "hello"
but inside parse we have a different syntax for copy. Once I realised that I felt much less confused & set about experimenting with to & thru in the context of copy within parse. Perhaps these results will be of interest to other noobs, although I must say actually typing them in helped me appreciate what was happening.
parse {ab hello cd} [copy result "a" "o"] ;; returns "a"
parse {ab hello cd} [copy result to "a" "o"] ;; returns none
parse {ab hello cd} [copy result "a" to "o"] ;; returns "a"
parse {ab hello cd} [copy result to "a" to "o"] ;; returns none
parse {ab hello cd} [copy result thru "a" "o"] ;; returns "a"
parse {ab hello cd} [copy result "a" thru "o"] ;; returns "a"
parse {ab hello cd} [copy result thru "a" thru "o"] ;; returns "a"
parse {ab hello cd} [copy result "h" "o"] ;; returns []
parse {ab hello cd} [copy result to "h" "o"] ;; returns "ab "
parse {ab hello cd} [copy result "h" to "o"] ;; returns []
parse {ab hello cd} [copy result to "h" to "o"] ;; returns "ab "
parse {ab hello cd} [copy result thru "h" "o"] ;; returns "ab h"
parse {ab hello cd} [copy result "h" thru "o"] ;; returns []
parse {ab hello cd} [copy result thru "h" thru "o"] ;; returns "ab h"
parse {ab hello cd} [copy result ["a" "o"]] ;; returns []
parse {ab hello cd} [copy result [to "a" "o"]] ;; returns []
parse {ab hello cd} [copy result ["a" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [to "a" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [thru "a" "o"]] ;; returns []
parse {ab hello cd} [copy result ["a" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result [thru "a" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result ["h" "o"]] ;; returns []
parse {ab hello cd} [copy result [to "h" "o"]] ;; returns []
parse {ab hello cd} [copy result ["h" to "o"]] ;; returns []
parse {ab hello cd} [copy result [to "h" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [thru "h" "o"]] ;; returns []
parse {ab hello cd} [copy result ["h" thru "o"]] ;; returns []
parse {ab hello cd} [copy result [thru "h" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result [thru "a" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [thru "h" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result thru "a" to "o"] ;; returns "a"
parse {ab hello cd} [copy result thru "h" to "o"] ;; returns "ab h"
parse {ab hello cd} [copy result [to "a" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result [to "h" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result to "a" thru "o"] ;; returns none
parse {ab hello cd} [copy result to "h" thru "o"] ;; returns "ab "
parse {ab hello cd} ["h" copy result "o"] ;; returns []
parse {ab hello cd} [to "h" copy result "o"] ;; returns []
parse {ab hello cd} ["h" copy result to "o"] ;; returns []
parse {ab hello cd} [to "h" copy result to "o"] ;; returns "hell"
parse {ab hello cd} [thru "h" copy result "o"] ;; returns []
parse {ab hello cd} ["h" thru copy result "o"] ;; returns []
parse {ab hello cd} [thru "h" copy result thru "o"] ;; returns "ello"
parse {ab hello cd} ["a" copy result "o"] ;; returns []
parse {ab hello cd} [to "a" copy result "o"] ;; returns []
parse {ab hello cd} ["a" copy result to "o"] ;; returns "b hell"
parse {ab hello cd} [to "a" copy result to "o"] ;; returns "ab hell"
parse {ab hello cd} [thru "a" copy result "o"] ;; returns []
parse {ab hello cd} ["a" copy result thru "o"] ;; returns "b hello"
parse {ab hello cd} [thru "a" copy result thru "o"] ;; returns "b hello"
parse {ab hello cd} [thru "a" copy result to "o"] ;; returns "b hell"
parse {ab hello cd} [thru "h" copy result to "o"] ;; returns "ell"
parse {ab hello cd} [to "a" copy result thru "o"] ;; returns "ab hello"
parse {ab hello cd} [to "h" copy result thru "o"] ;; returns "hello" |
Oldes 17-Apr-2009 [1774x2] | parse/all {ab hello cd} [3 skip copy result 5 skip to end] ;; returns "hello"
parse/all {ab hello cd} [thru #" " copy result to #" " to end] ;; returns "hello" |
(you don't have to use the [to end] in my versions if you don't care that parse returns false instead of true) | |
mhinson 18-Apr-2009 [1776] | I have written my first bit of code that is starting to do something useful. All the bad bits are mine & all the good bits are from the help I have been given here. My main intention is to start off with code that I can understand & develop so any criticism would be most welcome. My next step is to remove the debug code & replace it with code that stores all the information in a structured form for searching & further analysis. Thanks for all your help with this.
filename: copy %/c/temp/cisco.txt ;; cisco config file
host: copy []
interface: copy []
intDescription: copy []
intIpAddress: []
ipRoute: []
IntFlag: false
spacer: charset " ^/"
name-char: complement spacer
lines: read/lines filename
foreach line lines [ ;; move through lines
    parse/all line [
        copy temp ["interface" to end] ( ;; evaluated if "interface" found preceded by nothing else
            interface: copy temp
            print interface ;; debug code
            IntFlag: true
        )
        | copy temp2 [" desc" to end] ( ;; evaluate if " desc" found preceded by nothing else
            if IntFlag [print temp2] ;; debug
        )
        | copy temp3 [" ip address" to end] ( ;; " ip address"
            print temp3 ;; debug
        )
        | copy temp4 ["hostname" to end] ( ;; "hostname"
            print temp4 ;; debug
        )
        | copy temp5 [name-char to end] ( ;; any char except space or newline. this must be last
            ; if IntFlag [print temp5] ;; debug
            if IntFlag [print "!"] ;; debug
            IntFlag: false
        )
    ]
]
;######################################
The input file contains these lines which are extracted (except the !) plus it has a load more lines that are ignored at the moment.
hostname pig
interface Null0
!
interface Loopback58
 description SLA changed this
!
interface ATM0
!
interface ATM0.1 point-to-point
!
interface FastEthernet0
 description my first port
!
interface FastEthernet1
 description test1
!
interface FastEthernet2
 description test2
!
interface FastEthernet3
!
interface Dot11Radio0
!
interface Vlan1
 description User vlan (only 1 vlan allowed)
!
interface Dialer0
 description $FW_OUTSIDE$
 ip address negotiated
!
interface BVI1
 description $FW_INSIDE$
 ip address 192.168.0.1 255.255.255.0
!
!########### end ########## |
Steeve 18-Apr-2009 [1777] | What a waste... Are you sure you understand the idea behind parsing well? It's not specific to REBOL; parsing exists in many computer languages. First you have to understand the theory behind it. If not, you will just produce trash code like that. |
[unknown: 5] 19-Apr-2009 [1778] | mhinson, don't be discouraged by Steeve's lack of politeness. I assure you that we are not all this way. Just be sure to ask questions. |
Pekr 19-Apr-2009 [1779] | Steeve - why a waste? REBOL's parse allows even lamers like me to produce a result which in the end does what I want it to do, but you surely would not like to see my parse rules :-) I can't write a single piece of regexp, yet REBOL's parse is useful to me. |
Henrik 19-Apr-2009 [1780x2] | it would be interesting if the config file could be loaded, by making unloadable parts like $FW_OUTSIDE$ loadable using simple string replacable. Then you could just 'load the file into a block and it would be considerably easier to parse. |
string replacable = string replacement | |
Steeve 19-Apr-2009 [1782] | Pekr, I'm just disappointed by what mhinson produced after getting so much advice from Rebolers like you. The read/lines trick is quite useless; why doesn't he use the standard way of traversing newlines with parse? And why use code (inside parens) to manage optional rules? Are commands like SOME, ANY, OPT not enough to manage simple rules like that? |
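Steeve does not show what he has in mind, so the following is only one possible reading of his suggestion: a single PARSE over the whole text that walks newlines itself and uses ANY and OPT rather than a flag. The sample config string, the rule name rest-of-line and the words name and desc are assumptions made for this sketch, and it only handles a description that directly follows its interface line.

```rebol
; Made-up sample input in the style of the config shown earlier
config: {hostname pig
interface FastEthernet0
 description my first port
!
interface FastEthernet1
!
}

desc: none
rest-of-line: [to newline skip]     ; consume one whole line, including its newline

parse/all config [
    any [
        "interface " copy name to newline skip
        opt [" description " copy desc to newline skip]
        (print [name "|" any [desc "(no description)"]]  desc: none)
        | rest-of-line
    ]
]
```

Every alternative consumes a complete line, so each pass of ANY starts at the beginning of a line, which keeps "interface " anchored to the line start without read/lines.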
Henrik 19-Apr-2009 [1783] | I'm just admiring that mhinson wasn't scared of jumping into PARSE so soon. :-) |
mhinson 19-Apr-2009 [1784] | Thanks for the feedback, all is most welcome. I will try to avoid read/lines if it is bad. Is there a list of things I can't expect to load? Should I convert them to some symbolic value & then convert them back again for the final output? I don't yet understand why a block would be easier to parse than lines; by easier do you mean more efficient or easier to create the code? The optional rules (inside parens) are to change the behavior based on lines read previously so I don't yet understand any concept that would let me avoid those. I need the code to be very simple (like me) so I can understand how it is operating. I know my implementation goes against the Rebol ethos of small & efficient but perhaps in time I can understand enough to make it so & also start using relative expressions properly so it can be simple to understand. |
Henrik 19-Apr-2009 [1785] | "I don't yet understand why a block would be easier to parse than lines, by easier do you mean more efficient or easier to create the code?"
Yes, it's easier, because REBOL is based around this concept. Without this concept, dialects wouldn't make much sense. Your configuration file shown above is a good candidate for a dialect with some tweaks. I suggest, you read again what I wrote above about the basics of words, context and meaning. I can't emphasize enough how important this is to REBOL and especially for block parsing. It's important to understand this before moving on, or REBOL will remain difficult to use for you. Or drop your parse project for now and stick with the basics until you understand more of REBOL.
"is there a list of things I can't expect to load?"
The LOAD function will error out, if a string (such as a file you read) can't be loaded into a block of data. Try these in the console, and see which ones work:
load "hello"
load "hello,"
load "hello."
load "%"
load "1 2 3 4"
load "hostname pig interface Null0 ! interface Loopback58 description SLA changed this"
load %/c/temp/cisco.txt |
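Henrik's "try these and see" experiment can also be run in one go by trapping the error instead of letting it stop the console. This is a small sketch; the list of sample strings is an assumption assembled from strings mentioned in this thread, and no result is claimed for any of them here.

```rebol
; Strings taken from the discussion; run it to see which ones LOAD accepts
samples: ["hello" "hello," "hello." "1 2 3 4" "$FW_OUTSIDE$"]

foreach text samples [
    either error? try [load text] [
        print [mold text "-> not loadable"]
    ][
        print [mold text "-> loadable"]
    ]
]
```

TRY returns the error value instead of halting, so ERROR? can turn each attempt into a simple yes/no report.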
sqlab 19-Apr-2009 [1786x2] | I am against loading these configuration files. Why?
- you cannot control what is inside
- we already know that there are elements unloadable by REBOL
- and the description almost always needs string parsing
What would I change? I would either use only one temp variable in the parse rule and afterwards just set the new variable, as there is already a copy involved, or I would use a meaningful variable name in the first place. |
Another reason against loading - we can not determine, if "interface" is at the start of a line | |
mhinson 19-Apr-2009 [1788] | Good point about my temp1 temp2 etc. That was sloppy. It is true I cannot control what is inside the config files. They can contain any printable chars (e.g. in encrypted password fields or remarks/descriptions or embedded TCL code) and sometimes I am going to want to capture that text. I don't mind trying to do both methods as it will help me learn. Since I can't load the file directly I am thinking I will need to do a read %file.txt & replace the /\,[]%$()@:? with %xx etc. then load the result? I can't find a list of all the chars that I would need to treat like this yet. Henrik, I do continue to read what you have written, it is helpful & I think I am beginning to appreciate the concepts. I am probably not as clear as I should be about the specification of what I am trying to do so the code has tended towards listing the requirements rather than being elegant. Thanks /\/\ |
sqlab 19-Apr-2009 [1789] | Further enhancements: I would try to extract the rule parts into single rules, e.g.
interfacerule: [--]
descriptionrule: [--]
etc. (names are debatable.)
Better than [copy .. [" ip address" to end]] is probably [to "ip address" copy temp to end], unless you know that there is always one space. |
Sunanda 19-Apr-2009 [1790] | 'read/lines is not bad. It enables you to easily split the problem into smaller phases. It makes it harder to solve the problem in one huge 'parse. But maybe one big 'parse is not the best approach -- especially if you need (say) backtracking for error recovery, or line-numbers of points of failure. You are getting several people's views on how to tackle your problem here. Take their advice seriously, but remember you are the domain specialist, so you get to choose which solution fits best. Not us :-) |
mhinson 19-Apr-2009 [1791] | OK. I follow your extraction of rules idea & this is what you had in your original suggestion. Now I am getting more familiar with what I am looking at I can understand the benefit of that & will start to work that way now.
[copy .. [" ip address" to end]] was to get the interface address in the interface section of the file. It is identified by
1) some line after the line containing "interface"
2) at the beginning of the line, always starting with one space before the words "ip address"
3) before any line with a non-blank first char unless it is a new instance of "interface" (hence my IntFlag, whose method of use Steeve didn't like)
I found from testing that [to "ip address" copy temp to end] or [to " ip address" copy temp to end] found the string anywhere in the line, but [copy .. [" ip address" to end]] only finds the string if it is at the start of the line, which is what I was trying to achieve. Have I made a mistake here & need to retest my assumptions perhaps?
I always appreciate lots of different views on issues so I am loving the multiple responses. Sunanda, you have reminded me about line numbers. I will tackle them after the extraction of rules I think, as I want them in my output for data output quality & validation checking. I have been looking at your parse-ini.r to see how you have read a file into a Rebol block, but I may stick with read/lines for a bit longer while I get my head round parsing each line in turn. I get the impression that once I have a final block of code there will be someone who can turn it into 2 short lines including a built-in Easter egg game. |
sqlab 19-Apr-2009 [1792] | No, you are right. If there is always one leading space after newline identifying a valid ip address, your approach is the best. I just don't know anything about the stringent syntax rules of your config files, hence my suggestion. |
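The difference mhinson tested, a rule that starts with the literal text versus one that starts with TO, can be reproduced on two sample lines. The lines and the rule names anchored and floating are made up for this sketch; the rules themselves follow the two forms discussed above.

```rebol
line1: " ip address 192.168.0.1 255.255.255.0"   ; keyword at the start of the line
line2: "remark see ip address below"             ; keyword in the middle of the line

anchored: [" ip address " copy addr to end]      ; must match at the first character
floating: [to "ip address" copy addr to end]     ; TO scans forward to the text first

foreach line reduce [line1 line2] [
    print [
        mold line
        "anchored:" parse/all line anchored
        "floating:" parse/all line floating
    ]
]
```

Because a parse rule always begins matching at the head of the input, a rule that opens with a literal string acts as a start-of-line test when each line is parsed separately.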
mhinson 19-Apr-2009 [1793x2] | Thanks. These files are getting better with more recent versions of Cisco IOS but sometimes trial and error is the only way to find the formats used. |
I am still new and confused. Where can I read about how to do this please?
file: "%file.txt"
host: "Router1"
interface: "fa0"
i: 25
description: []
ipaddr: "1.1.1.1 255.255.255.0"
write/append/string %/c/temp/result.log [file tab i tab host tab interface tab description tab ipaddr newline]
I want the output file to be a tab-separated set of values but all I get is the text filetabitabhosttabinterfacetabdescriptiontabipaddrnewline | |
Oldes 19-Apr-2009 [1795x3] | write/append/string %/c/temp/result.log rejoin [file tab i tab host tab interface tab description tab ipaddr newline] |
and/or simply REDUCE instead of REJOIN should be enough in this case | |
Also you must use MOLD for the values, if you want to keep the type (for example the block as the description) | |
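What Oldes describes can be seen by probing each variant in the console before writing anything to disk. This is a small sketch reusing some of mhinson's variable names with made-up values.

```rebol
host: "Router1"
i: 25
description: []

probe reduce [host tab i newline]   ; a block of evaluated values
probe rejoin [host tab i newline]   ; one string with the values joined together
probe form description              ; "" : the block's brackets are lost
probe mold description              ; "[]" : the REBOL source form is kept
```

An unreduced block only contains the words host, tab, i and so on, which is why the original WRITE produced their names rather than their values.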
mhinson 19-Apr-2009 [1798x2] | Thanks very much. I have seen reduce & rejoin & mold but didn't realise they were relevant to writing a file.. this is the first time I have ever written to a file. |
I have tried to understand & take on what I have been told, thanks. Is this worse or better? It does what I was looking to do & I know how to extend it in the same structure. I am sure it would be educational for me if anyone has time to tear it to shreds please. Should I stop using read/lines now? Would I get the benefit still? Or is the requirement too fragmented for this approach now? Should I use functions anywhere instead? Have I initialised my variables in the right & appropriate way?
filename: copy %/c/temp/cisco.txt ;; cisco config file
outFile: copy %/c/temp/outFile.log ;; tab separated output
hostname: copy []
interface: copy []
intDesc: copy []
intIpaddr: []
ipRoute: []
IntFlag: false
spacer: charset " ^/"
name-char: complement spacer
lines: read/lines filename
outInterface: [
    write/append outFile reduce [filename tab i tab hostname tab interface tab intDesc tab intIpaddr newline]
]
clearInterface: [
    interface: copy []
    intDesc: copy []
    intIpaddr: []
]
interfaceRule: [
    ["interface " copy temp-interface to end] ( ;; captures point-point as well
        if IntFlag outInterface ;; start of new interface section so output data collected previously.
        if IntFlag clearInterface
        interface: copy temp-interface
        print ["! found at line " i] ;; debug
        print current-line ;; debug
        IntFlag: true
    )
]
descRule: [
    [" description " copy intDesc to end] (
        if IntFlag [print current-line] ;; debug
    )
]
ipAddrRule: [
    [" ip address " copy intIpaddr to end] (
        print current-line ;; debug
    )
]
hostnameRule: [
    ["hostname " copy hostname to end] ( ;; "hostname"
        print current-line ;; debug
    )
]
iprouteRule: [
    copy iproute ["ip route" to end] ( ;; "ip route"
        print current-line ;; debug
    )
]
IntFlagRule: [
    copy tempZZ [name-char to end] ( ;; not space or newline. this must be out of the int section
        if IntFlag outInterface ;; end of interface section so output data collected.
        if IntFlag clearInterface
        if IntFlag [print "!"] ;; debug
        IntFlag: false ;;
    )
]
i: 0
foreach line lines [
    i: i + 1 ;; move through lines & track line number
    current-line: line ;; for debug output
    parse/all line [ ;; parse only using rules below
        interfaceRule ;; evaluated if "interface" found preceded by nothing else
        | descRule ;; evaluate if " desc" found preceded by nothing else
        | ipAddrRule ;; " ip address"
        | hostnameRule ;; " hostname"
        | iprouteRule ;; "ip route"
        | IntFlagRule ;; unset interface flag if no longer in interface section (no " ^/")
    ]
] | |
Graham 19-Apr-2009 [1800x3] | filename: copy %/c/temp/cisco.txt ;; cisco config file
outFile: copy %/c/temp/outFile.log ;; tab separated output |
don't need 'copy there | |
what's this? | |