World: r3wp
[I'm new] Ask any question, and a helpful person will try to answer.
mhinson 17-Apr-2009 [1735] | :-) |
Geomol 17-Apr-2009 [1736] | You can parse strings and blocks. The /all refinement is used when parsing strings, not blocks. From your first comments, it seems, you're parsing blocks, so you don't need /all. What is lines? A string or a block? |
Henrik 17-Apr-2009 [1737] | the ife: you mention there is not a string; it is a set-word placed in the middle of the rule. a set-word! will register the current index in the block being parsed. |
mhinson 17-Apr-2009 [1738] | lines is from something like lines: read %file.txt or lines: {line one line2 line3} |
Geomol 17-Apr-2009 [1739x2] | Ok, you're parsing a string then. Then using /all is ok. |
Put the wanted: copy [] up front before you parse. Then drop the first or, |, just before SOME | |
Henrik 17-Apr-2009 [1741x3] | the difference between using a set-word and SET word!:
parse [a b c d] [
    w1: word! (probe w1)
    w2: word! (probe w1 probe w2)
    set w3 word! (probe w1 probe w2 probe w3)
    w4: word! (probe w1 probe w2 probe w3 probe w4/1)
] |
using a get-word! will allow you to change the position of parsing. | |
basically you must remember that a dialect doesn't uphold normal REBOL syntax. | |
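A minimal sketch of the get-word! behaviour mentioned above, assuming REBOL 2 PARSE semantics. The word mark is made up for this illustration and does not come from the thread.

```rebol
; mark is a hypothetical word used only for this illustration.
ok: parse/all "abc" [
    mark:      ; set-word: remember the current position (start of the string)
    "a"        ; match "a", moving the position forward by one
    :mark      ; get-word: jump back to the remembered position
    "abc"      ; match the whole string again, from the start
]
print ok       ; true
```

Without the :mark step the rule would fail, because "abc" would be tried against the remaining "bc".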
mhinson 17-Apr-2009 [1744] | It sounds as if I have missed the understanding of what a dialect is. |
[unknown: 5] 17-Apr-2009 [1745] | are you familiar with SQL? SQL is a form of dialect. |
mhinson 17-Apr-2009 [1746] | I know SQL in very general terms, but could not write a query |
[unknown: 5] 17-Apr-2009 [1747] | That isn't important. What is important to understand is that a dialect is a means of expression that is interpreted by the underlying language. Consider the following:
buy two cups soda at five dollars |
mhinson 17-Apr-2009 [1748] | ok |
[unknown: 5] 17-Apr-2009 [1749] | Each of those words could be interpreted by an underlying function to create a sum of the total cost. |
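One way the sentence above could be interpreted, as a rough sketch rather than anything posted in the thread. It assumes the price is per cup, and the names numbers, order, qty, price and total are invented for the example.

```rebol
; Hypothetical lookup table from number words to integers.
numbers: [one 1 two 2 three 3 four 4 five 5]

order: [buy two cups soda at five dollars]
total: 0

ok: parse order [
    'buy set qty word!        ; quantity, e.g. the word two
    'cups word!               ; the product name (ignored here)
    'at set price word!       ; price per cup, e.g. the word five
    'dollars
    (total: (select numbers qty) * (select numbers price))
]

print total                   ; 10
```

The block [buy two cups soda at five dollars] is not valid REBOL code for DO, but the parse rule gives each word a meaning.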
Henrik 17-Apr-2009 [1750x2] | A dialect is just a block of data that is processed in a certain way by REBOL. You can't evaluate it directly, but need some kind of parser to process it. You can create all sorts of crazy languages that way. Both the first and the second arguments to PARSE are dialects. The first one is the dialect block you provide to PARSE, the second one is the dialect used to process the first dialect. :-) |
Without this understanding, the concept of PARSE is very difficult to grasp. | |
mhinson 17-Apr-2009 [1752] | so I see.. If i do a "source parse" it says it is a native function, so at which point does the user form a dialect? |
Henrik 17-Apr-2009 [1753x4] | but unlike other languages that have to use regexp as a crutch for parsing, REBOL can use other methods than PARSE to process dialects. PARSE is useful in some cases and in other cases, there are better/simpler approaches for processing your dialect. |
You are in dialect territory immediately when you are defining a block of data and wanting to do something other than evaluating it with DO. | |
Want to create a dialect using only integers? Fine:
>> total: 0
== 0
>> parse [1 2 3 4] [4 [set num integer! (total: total + num)]]
== true
>> total
== 10 | |
But in another place, [1 2 3 4] might mean something entirely different. It also happens to evaluate as normal REBOL code:
>> do [1 2 3 4]
== 4
But of course it doesn't do much. :-) Code is data and data is code. | |
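As a sketch of processing a dialect without PARSE, here is a made-up two-word dialect handled with FOREACH and SWITCH. The words add and sub, and the names program and acc, are assumptions for illustration only.

```rebol
; Hypothetical mini-dialect: pairs of an operation word and an integer.
program: [add 2 add 3 sub 1]
acc: 0

foreach [op n] program [
    ; op is just a word taken from the data block; switch picks its meaning
    switch op [
        add [acc: acc + n]
        sub [acc: acc - n]
    ]
]

print acc    ; 4
```

The same block could be given a different meaning by a different processor, which is the point about context made above.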
mhinson 17-Apr-2009 [1757] | Perhaps I should go back to trying to form a program specification & see if the advice I get in that context is different. If I have
print "hello world"
that seems to follow syntax rules shown by "source print". Are you saying that because I could have
>> hi: [print "hello world"]
== [print "hello world"]
>> do hi
hello world
I have started using a dialect? |
Henrik 17-Apr-2009 [1758x2] | No, when using DO, it will not be a dialect, just normal REBOL code. Before you do anything with it, the block is just a chunk of data. A dialect involves some kind of processor, one that you write or that exists in REBOL already, which you then apply to the chunk of data, and which is not the base scanner (the main language parser). |
An example where DO wouldn't work would be VID (the graphics user interface system, Visual Interface Dialect)
do [button "Hello world!"] ; gives an error
layout [button "Hello world!"] ; returns a meaningful result, because the block was parsed as a dialect. | |
mhinson 17-Apr-2009 [1760x2] | Sorry, I am not getting it at all. |
what you seem to be showing me looks like what I would call functions or procedures. I can't understand the distinction yet. | |
sqlab 17-Apr-2009 [1762x2] | To make it short: it is like a new language in your programming language. It can be just an add-on to your normal language with enhancements, or a totally different language |
So parse has its own language, which sometimes resembles REBOL and sometimes is different. And you can make your own languages or dialects too | |
mhinson 17-Apr-2009 [1764x2] | it sounds like a very flexible concept, but likely to add complexity. parse seems to be well documented in terms of how a string can be split apart in this manner
>> probe parse {Hello world} none
["Hello" "world"]
but much less documented when trying to do complex stuff... I was in my ignorance expecting it to follow some sorts of syntax rules that I could read about. Have I missed a basic concept? |
I am only thinking about finding & extracting data here, not parsing for commas or html tags etc. | |
sqlab 17-Apr-2009 [1766] | http://www.rebol.com/docs/core23/rebolcore-15.html |
Henrik 17-Apr-2009 [1767] | I know the following sounds basic, but it's _crucial_ to understanding how REBOL works, otherwise you will not "get" REBOL:

You must understand the concept that data is code and code is data. That means that anything you write can be considered pure data. It's what you do with that data that becomes important.

It's like speaking a sentence, but only paying attention to the placement of letters. That makes the sentence pure data. If you then pay attention to the words objectively, they can form a sentence, so you can validate its syntax. If you use the sentence in a context, you can apply meaning to it, subjectively. If you switch the context, the sentence can mean something entirely different. This is very important. Context and meaning.

For REBOL:
[I have a cat]
This is a block with 4 words. It's pure data that can be stored in memory, but at that level it doesn't make any sense to REBOL. If you then apply a function to that data, you can process it. DO processes that data as REBOL code. It will be evaluated as REBOL code. Here it will produce an error, because it's not valid REBOL code. If you produce your own dialect, for example with PARSE, you can make that block make sense.

When typing in the console, REBOL evaluates it as normal REBOL code by using DO internally. That means:
>> now
== 17-Apr-2009/18:13:14+2:00
is the same as:
do [now]
But this block:
[now]
is just pure innocent data with no meaning. |
mhinson 17-Apr-2009 [1768] | you are all very kind to spend so much time helping me with this. |
Brock 17-Apr-2009 [1769x3] | Section 7.4, Marking Input, in the link provided by sqlab explains the use of set-words in the parse dialect. I needed to use this technique to strip out large comments from web logs that I was parsing and passing into a database. I was able to remove the large comment and replace it with a default string indicating "comments have been removed" |
One of the methods I used to parse lines of data was as follows:
lines: read/lines %file.txt
foreach line lines [
    parse line [parse rule here]
] | |
This way you are only dealing with a small amount of data for each parse and might make it easier to visualize for you. | |
Pekr 17-Apr-2009 [1772] | mhinson: now my explanations to some of your questions, as I think not everything was explained to you properly:

1) parse/all - /all refinement means, that string is parsed "as-is", because without the /all, white-space is skipped:
>> parse "this white dog" ["this" "white" "dog"]
== true
>> parse/all "this white dog" ["this" "white" "dog"]
== false
>> parse/all "this white dog" ["this" " " "white" " " "dog"]
== true
I prefer to always use /all refinement for string parsing ...

2) i don't understand why there is | before "some", that code will not work imo ...

3) "ifa:" is a marker. Think about parse in following terms ... you have your data, here a string. Parse is the matching engine, which tries to match your input string according to given rules. In parse context (dialect) you have no means of how to manipulate the input string, except the copy. So markers are usually used, when you want to mark some position, then do something in parens, and then get back the position, or simply mark start: .... then somewhere later end: and in the paren (copy/part start end) to copy the text between the two marked positions ...

4) "skips till one of the OR conditions are met" - very well understood ...

5) Here's slight modification for append/only stuff. Type "help append" in the console. /only appends block value as a block. You will understand that, once you will need such behaviour, so far it can look kind of academic to you :-) I put parens there, to make more obvious, what parameters are consumed by what function ....
>> wanted: copy []
== []
>> append (append wanted (copy/part "12345" 3)) interf: copy ["abc"]
== ["123" "abc"]
>> wanted: copy []
== []
>> append/only (append wanted (copy/part "12345" 3)) interf: copy ["abc"]
== ["123" ["abc"]] |
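The marker technique from point 3 could look roughly like this, assuming REBOL 2 PARSE semantics. The input string and the names text, start, finish and value are invented for the example; finish is used instead of end because END is already a PARSE keyword.

```rebol
; Hypothetical input and names, chosen only for illustration.
text: "name=John;"
value: none

ok: parse/all text [
    thru "="          ; skip up to and including "="
    start:            ; mark the position just after "="
    to ";"            ; advance up to (not past) ";"
    finish:           ; mark the position of ";"
    (value: copy/part start finish)   ; copy the text between the two marks
    to end
]

print value           ; John
```

Here the same result could be had with copy value to ";" inside the rule; markers become more useful when the text between them has to be changed or inspected with ordinary REBOL code.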
mhinson 17-Apr-2009 [1773] | Thanks very much again for so much help, I am very grateful for the time you have spent helping me with this. A bit of a light is beginning to come on.. so outside of the parse dialect we have this syntax
result: copy "hello"
but inside parse we have a different syntax for copy. Once I realised that I felt much less confused & set about experimenting with to & thru in the context of copy within parse. Perhaps these results will be of interest to other noobs, although I must say actually typing them in helped me appreciate what was happening.

parse {ab hello cd} [copy result "a" "o"] ;; returns "a"
parse {ab hello cd} [copy result to "a" "o"] ;; returns none
parse {ab hello cd} [copy result "a" to "o"] ;; returns "a"
parse {ab hello cd} [copy result to "a" to "o"] ;; returns none
parse {ab hello cd} [copy result thru "a" "o"] ;; returns "a"
parse {ab hello cd} [copy result "a" thru "o"] ;; returns "a"
parse {ab hello cd} [copy result thru "a" thru "o"] ;; returns "a"
parse {ab hello cd} [copy result "h" "o"] ;; returns []
parse {ab hello cd} [copy result to "h" "o"] ;; returns "ab "
parse {ab hello cd} [copy result "h" to "o"] ;; returns []
parse {ab hello cd} [copy result to "h" to "o"] ;; returns "ab "
parse {ab hello cd} [copy result thru "h" "o"] ;; returns "ab h"
parse {ab hello cd} [copy result "h" thru "o"] ;; returns []
parse {ab hello cd} [copy result thru "h" thru "o"] ;; returns "ab h"

parse {ab hello cd} [copy result ["a" "o"]] ;; returns []
parse {ab hello cd} [copy result [to "a" "o"]] ;; returns []
parse {ab hello cd} [copy result ["a" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [to "a" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [thru "a" "o"]] ;; returns []
parse {ab hello cd} [copy result ["a" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result [thru "a" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result ["h" "o"]] ;; returns []
parse {ab hello cd} [copy result [to "h" "o"]] ;; returns []
parse {ab hello cd} [copy result ["h" to "o"]] ;; returns []
parse {ab hello cd} [copy result [to "h" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [thru "h" "o"]] ;; returns []
parse {ab hello cd} [copy result ["h" thru "o"]] ;; returns []
parse {ab hello cd} [copy result [thru "h" thru "o"]] ;; returns "ab hello"

parse {ab hello cd} [copy result [thru "a" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result [thru "h" to "o"]] ;; returns "ab hell"
parse {ab hello cd} [copy result thru "a" to "o"] ;; returns "a"
parse {ab hello cd} [copy result thru "h" to "o"] ;; returns "ab h"
parse {ab hello cd} [copy result [to "a" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result [to "h" thru "o"]] ;; returns "ab hello"
parse {ab hello cd} [copy result to "a" thru "o"] ;; returns none
parse {ab hello cd} [copy result to "h" thru "o"] ;; returns "ab "

parse {ab hello cd} ["h" copy result "o"] ;; returns []
parse {ab hello cd} [to "h" copy result "o"] ;; returns []
parse {ab hello cd} ["h" copy result to "o"] ;; returns []
parse {ab hello cd} [to "h" copy result to "o"] ;; returns "hell"
parse {ab hello cd} [thru "h" copy result "o"] ;; returns []
parse {ab hello cd} ["h" thru copy result "o"] ;; returns []
parse {ab hello cd} [thru "h" copy result thru "o"] ;; returns "ello"
parse {ab hello cd} ["a" copy result "o"] ;; returns []
parse {ab hello cd} [to "a" copy result "o"] ;; returns []
parse {ab hello cd} ["a" copy result to "o"] ;; returns "b hell"
parse {ab hello cd} [to "a" copy result to "o"] ;; returns "ab hell"
parse {ab hello cd} [thru "a" copy result "o"] ;; returns []
parse {ab hello cd} ["a" copy result thru "o"] ;; returns "b hello"
parse {ab hello cd} [thru "a" copy result thru "o"] ;; returns "b hello"

parse {ab hello cd} [thru "a" copy result to "o"] ;; returns "b hell"
parse {ab hello cd} [thru "h" copy result to "o"] ;; returns "ell"
parse {ab hello cd} [to "a" copy result thru "o"] ;; returns "ab hello"
parse {ab hello cd} [to "h" copy result thru "o"] ;; returns "hello" |
Oldes 17-Apr-2009 [1774x2] | parse/all {ab hello cd} [3 skip copy result 5 skip to end] ;; returns "hello"
parse/all {ab hello cd} [thru #" " copy result to #" " to end] ;; returns "hello" |
(you don't have to use the [to end] in my versions if you don't care that parse returns false instead of true) | |
mhinson 18-Apr-2009 [1776] | I have written my first bit of code that is starting to do something useful. All the bad bits are mine & all the good bits are from the help I have been given here. My main intention is to start off with code that I can understand & develop so any criticism would be most welcome. My next step is to remove the debug code & replace it with code that stores all the information in a structured form for searching & further analysis. Thanks for all your help with this.

filename: copy %/c/temp/cisco.txt  ;; cisco config file
host: copy []
interface: copy []
intDescription: copy []
intIpAddress: []
ipRoute: []
IntFlag: false
spacer: charset " ^/"
name-char: complement spacer

lines: read/lines filename
foreach line lines [  ;; move through lines
    parse/all line [
        copy temp ["interface" to end] (  ;; evaluated if "interface" found preceded by nothing else
            interface: copy temp
            print interface  ;; debug code
            IntFlag: true
        )
        | copy temp2 [" desc" to end] (  ;; evaluate if " desc" found preceded by nothing else
            if IntFlag [print temp2]  ;; debug
        )
        | copy temp3 [" ip address" to end] (  ;; " ip address"
            print temp3  ;; debug
        )
        | copy temp4 ["hostname" to end] (  ;; "hostname"
            print temp4  ;; debug
        )
        | copy temp5 [name-char to end] (  ;; any char except space or newline. this must be last
            ; if IntFlag [print temp5]  ;; debug
            if IntFlag [print "!"]  ;; debug
            IntFlag: false
        )
    ]
]
;######################################
the input file contains these lines which are extracted (except the !) plus it has a load more lines that are ignored at the moment.

hostname pig
interface Null0
!
interface Loopback58
 description SLA changed this
!
interface ATM0
!
interface ATM0.1 point-to-point
!
interface FastEthernet0
 description my first port
!
interface FastEthernet1
 description test1
!
interface FastEthernet2
 description test2
!
interface FastEthernet3
!
interface Dot11Radio0
!
interface Vlan1
 description User vlan (only 1 vlan allowed)
!
interface Dialer0
 description $FW_OUTSIDE$
 ip address negotiated
!
interface BVI1
 description $FW_INSIDE$
 ip address 192.168.0.1 255.255.255.0
!
!########### end ########## |
Steeve 18-Apr-2009 [1777] | What a waste... Are you sure you understand well the idea behind parsing? It's not specific to rebol; parsing exists in many computer languages. At first you have to understand the theory behind it. If not, you will just produce trash code like that. |
[unknown: 5] 19-Apr-2009 [1778] | mhinson, don't be discouraged by Steeve's lack of politeness. I assure you that we are not all this way. Just be sure to ask questions. |
Pekr 19-Apr-2009 [1779] | Steeve - why a waste? REBOL's parse allows even lamers like me to produce the result, which in the end does what I want it to do, but you surely would not like to see my parse rules :-) I can't write a single piece of regexp, yet REBOL's parse is useful to me. |
Henrik 19-Apr-2009 [1780x2] | it would be interesting if the config file could be loaded, by making unloadable parts like $FW_OUTSIDE$ loadable using simple string replacable. Then you could just 'load the file into a block and it would be considerably easier to parse. |
string replacable = string replacement | |
Steeve 19-Apr-2009 [1782] | Pekr, I'm just disappointed by what mhinson produced after getting so much advice from Rebolers like you. The read/lines trick is quite useless; why doesn't he use the standard way of traversing newlines with parse? And why use code (inside parens) to manage optional rules? Are commands like SOME, ANY, OPT not enough to manage simple rules like that? |
Henrik 19-Apr-2009 [1783] | I'm just admiring that mhinson wasn't scared of jumping into PARSE so soon. :-) |
mhinson 19-Apr-2009 [1784] | Thanks for the feedback, all is most welcome. I will try to avoid read/lines if it is bad. Is there a list of things I can't expect to load? Should I convert them to some symbolic value & then convert them back again for the final output? I don't yet understand why a block would be easier to parse than lines; by easier do you mean more efficient or easier to create the code? The optional rules (inside parens) are to change the behavior based on lines read previously, so I don't yet understand any concept that would let me avoid those. I need the code to be very simple (like me) so I can understand how it is operating. I know my implementation goes against the Rebol ethos of small & efficient but perhaps in time I can understand enough to make it so & also start using relative expressions properly so it can be simple to understand. |
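A rough sketch of what Steeve's suggestion might look like: parsing the whole file as one string, stepping over newlines inside the rule, and using OPT for the description line instead of a flag. This was not posted in the thread; the sample text is a shortened copy of the config above, and the names config, interfaces, name and desc are assumptions.

```rebol
; Shortened sample input; a real file would be loaded with: config: read %file.txt
config: {hostname pig
interface Null0
!
interface FastEthernet0
 description my first port
!
}

interfaces: copy []

ok: parse/all config [
    any [
        "interface " copy name to newline newline
        (desc: none)                                         ; reset before each interface
        opt [" description " copy desc to newline newline]   ; description line is optional
        (repend interfaces [name desc])
        | thru newline                                       ; any other line: skip it
    ]
]

probe interfaces   ; ["Null0" none "FastEthernet0" "my first port"]
```

Because the description rule follows the interface rule directly, no IntFlag is needed to remember that an interface line was just seen.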