Mailing List Archive: 49091 messages

'Parse is peculiar!

 [1/18] from: shannon:ains:au at: 14-Dec-2000 17:05


Hi REBOL Community,

The REBOL philosophy goes "Simple things should be simple". Well, I have to say that the 'parse function is an exception! I've had to use it extensively for parsing log files, but it has literally taken me months to do simple things, compared to a few weeks for the rest of the core set. Normally I go over the manual entry extensively looking for clues as to what is going wrong; this time I'm really stuck. Look at this:

REBOL/View 0.10.38.3.1 28-Nov-2000
Copyright 2000 REBOL Technologies. All rights reserved.
>> digits: charset "0123456789"
== make bitset! #{000000000000FF0300000... [snip] ...000}
>> line1: {Lets find "Julie<1234>"}
>> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
Julie<1234>
== true

But what if I don't want to include the <xxxx> in 'name?

>> parse line1 [thru {"} copy name [to {<} 4 digits {>} (print name)] to end]
-------------------------------------| Note the 'thru changed to 'to
== false

That doesn't make sense. Now the second problem:

>> line2: {Lets find "J<o>hn<1234>"}
>> parse line2 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
== false

So there's the second problem: parse seems to get stuck on '<o>'. I would assume the sub-rule should only match a string containing '<xxxx>' where x is a digit.

Please don't ask me to use 'find, or to change the data structure, or to parse the results twice. I want to understand why 'parse doesn't return the results I expect.

Finally, the following line causes the REBOL console to hang:

>> parse line2 [thru {"} copy name [some [to {<}]] to end]

and once again I can't get my head around it. Clearly the manual needs more detail on 'parse.

SpliFF

 [2/18] from: petr:krenzelok:trz:cz at: 14-Dec-2000 8:33


Shannon Baker wrote:
> Hi REBOL Community,
> The REBOL philosophy goes "Simple things should be simple". Well I have to say
<<quoted lines omitted: 11>>
> Julie<1234>
> == true

Huh, what? I tried the above code and it fails ;-) (print name) has to be after the closing bracket, or you get:

->> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
** Script Error: name has no value
** Near: print name

> but what if I don't want to include the <xxxx> in 'name?
>
> >> parse line1 [thru {"} copy name [to {<} 4 digits {>} (print name)] to end]
> -------------------------------------| Note the 'thru changed to 'to
> == false

Hmm, why is the result false, you ask? Because "to {<}" stops the parser just in front of the < char, so to turn the result into 'true you would have to do it the following way:

->> parse line1 [thru {"} copy name [to {<} {<} 4 digits {>}] (print name) to end]
Julie<1234>
== true

> That doesn't make sense. Now the second problem:

Does it make more sense now? Well, you wanted just the name, so why use the strange syntax above? What about the following code:

->> parse line1 [thru {"} copy name to {<} (print name) to end]
Julie
== true

> >> line2: {Lets find "J<o>hn<1234>"}
> >> parse line2 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
> == false
>
> So there's the second problem, parse seems to get stuck on '<o>'. I would assume
> the sub rule should only match a string containing '<xxxx>' and x is a digit.

Well, and you are right. It doesn't match, and that's why you got 'false, no? What's the problem here?

->> alpha: charset [#"a" - #"z" #"A" - #"Z"]
== make bitset! #{
0000000000000000FEFFFF07FEFFFF0700000000000000000000000000000000
}
->> parse line2 [thru {"} copy name [any [some alpha | 4 digits | {>} | {<}]] (print name) to end]
J<o>hn<1234>
== true

I know it's not a very strict expression, as it will match any combination of sequences of alphas, 4 digits, and separate < and > characters, so it would even match something like <>Jo1234><1234>. To better suit your needs:

>> parse line2 [thru {"} copy name [some alpha {<} some [some alpha {>} some alpha {<} 4 digits {>} | 4 digits {>}]] (print name) to end]
J<o>hn<1234>
== true

> Please don't ask me to use 'find or to change the data structure, or to parse the
> results twice. I want to understand why 'parse doesn't return the results I
> expect.
>
> Finally the follow line causes the rebol console to hang:
>
> >>parse line2 [thru {"} copy name [some [to {<}]] to end]

:-) Well, that's just because the "to" statement puts you in front of "<", and then again and again and again ... until you finally skip the damned "<" ;-) Just change it to "thru" or try to match "<".

Cheers,
-pekr-

 [3/18] from: al:bri:xtra at: 14-Dec-2000 21:35


> but what if I don't want to include the <xxxx> in 'name?
>
> >> parse line1 [thru {"} copy name [to {<} 4 digits {>} (print name)] to end]
> -------------------------------------| Note the 'thru changed to 'to
> == false
>
> That doesn't make sense.

'to recognises the "<" and stops just before the "<" in the string. You'll need to use:

thru "<"

to get "thru" it.

> Now the second problem:
>
> >> line2: {Lets find "J<o>hn<1234>"}
> >> parse line2 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
> == false
>
> So there's the second problem, parse seems to get stuck on '<o>'. I would
> assume the sub rule should only match a string containing '<xxxx>' and x is a digit.

No, the rule hasn't got there yet.

{Lets find "J<o>hn<1234>"} ; Before [thru {"}
{J<o>hn<1234>"} ; After [thru {"} thru {<}
{o>hn<1234>"} ; After [thru {<} 4 digits

Parse fails here because "o" (lower case letter "O") isn't a character that matches the 'digits bitset.

> Finally the follow line causes the rebol console to hang:
>
> >>parse line2 [thru {"} copy name [some [to {<}]] to end]
>
> and once again I can't get my head around it.

{Lets find "J<o>hn<1234>"} ; Before [some [to {<}]]
{<o>hn<1234>"} ; After

Note that 'to doesn't go "thru" its match. 'some specifies one or more matches of the thing to the right. Because 'to doesn't go through its match, 'some keeps repeating the 'to "<" as it keeps matching (without advancing through the input). So you have an infinite loop, and the REBOL console hangs. I believe this is called "greedy parsing".

> Clearly the manual needs more detail on 'parse.

I agree. Rebol is still in a state of flux, though.

I hope that helps!

Andrew Martin
No longer so greedy...
ICQ: 26227169
http://members.nbci.com/AndrewMartin/
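
The difference between 'to and 'thru described above can be tried in isolation. This is a minimal sketch, assuming REBOL 2 parse semantics; parse/all is used here (an assumption, not something the thread used) so that whitespace is never skipped implicitly.

```rebol
digits: charset "0123456789"
line1: {Lets find "Julie<1234>"}
line2: {Lets find "J<o>hn<1234>"}

; 'to leaves the cursor in front of "<", so the "<" still has to be
; matched explicitly before the digits.
parse/all line1 [thru {"} copy name to {<} {<} 4 digits {>} to end]
print name    ; name holds only the text in front of "<"

; 'thru consumes the "<" on every pass, so 'some always advances
; through the input and stops once no further "<" is found.
parse/all line2 [thru {"} some [thru {<}] to end]
```

The second call should return true instead of hanging, because each repetition moves the cursor past one "<".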

 [4/18] from: al:bri:xtra at: 14-Dec-2000 21:51


SpliFF wrote:
> >> line1: {Lets find "Julie<1234>"}
> >> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
> Julie<1234>
> == true
>
> but what if I don't want to include the <xxxx> in 'name?

>> parse line1 [thru {"} copy Name to "<" "<" 4 digits {>"} end (print Name)]
Julie
== true

A nicer way would be:

>> parse line1 [thru {"} copy Name to ["<" 4 digits {>"}] end (print Name)]
** Script Error: Invalid argument: < 4 digits >"
** Near: parse line1 [thru {"} copy Name to ["<" 4 digits {>"}] end (print Name)]

but unfortunately, 'to isn't yet smart enough to understand a block of rules.

Andrew Martin
ICQ: 26227169
http://members.nbci.com/AndrewMartin/

 [5/18] from: al:bri:xtra at: 14-Dec-2000 22:05


> A nicer way would be:
> >> parse line1 [thru {"} copy Name to ["<" 4 digits {>"}] end (print Name)]
> ** Script Error: Invalid argument: < 4 digits >"
> ** Near: parse line1 [thru {"} copy Name to ["<" 4 digits {>"}] end (print Name)]

And better still might be:

parse line1 [thru {"} copy Name to ["<" 4 digits {>"}] to end (print Name)]

Note the 'to before 'end.

Andrew Martin
My excuse is I forgot...
ICQ: 26227169
http://members.nbci.com/AndrewMartin/

 [6/18] from: brett:codeconscious at: 14-Dec-2000 20:25


Howdy,

I'll address your immediate questions first, then make a stab at explaining what is happening.

> >> digits: charset "0123456789"
> == make bitset! #{000000000000FF0300000... [snip] ...000}
> >> line1: {Lets find "Julie<1234>"}
> >> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
> Julie<1234>
> == true
>
> but what if I don't want to include the <xxxx> in 'name?
>
> >> parse line1 [thru {"} copy name [to {<} 4 digits {>} (print name)] to end]
> -------------------------------------| Note the 'thru changed to 'to
> == false
>
> That doesn't make sense. Now the second problem:

It actually does make sense. Imagine a cursor on your line:

to {<}

positions your cursor just before the {<}. Then your rule says it must be followed by 4 digits, but < is not a digit, so your rule fails.

> >> line2: {Lets find "J<o>hn<1234>"}
> >> parse line2 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
> == false
>
> So there's the second problem, parse seems to get stuck on '<o>'. I would assume
> the sub rule should only match a string containing '<xxxx>' and x is a digit.

That's right. You had a rule that began with matching a <. Parse now expects a digit, but you disappointed it by giving it an o.

> Please don't ask me to use 'find or to change the data structure, or to parse the
> results twice. I want to understand why 'parse doesn't return the results I
> expect.

Best way to learn.

> Finally the follow line causes the rebol console to hang:
>
> >>parse line2 [thru {"} copy name [some [to {<}]] to end]

You put parse into an infinite loop.

> and once again I can't get my head around it. Clearly the manual needs more
> detail on 'parse.

After reading it many times and finally getting my head around parse, I realise the manual is accurate. It is maybe deficient in not getting people to think in the "right" way from the start.

It takes longer to understand parse because early on you can create rules that work 90% of the time, and then all of a sudden, after a small change, don't work at all. The problem for me was not parse, it was how I was thinking.

If you allow me a little licence, here is how I understand parse works.

The rules that you give parse are like hypotheses. Imagine you develop a theory that you hope will explain the input. You give the "theory" (rules) to the parse function to check whether you were right.

To check your rules, parse conducts experiments. It moves through the input, matching what it sees against your explanation. Each rule you give parse has to complete to be successful. If it completes, then the input that was explained by that rule is left behind as having been dealt with. Parse ticks off the successful rule and gets the next appropriate rule. In order to tick off compound rules (those enclosed with a "[" and "]"), parse will have to tick off each nested rule recursively.

If parse finds that the rule fails to explain the input, it will discard the rule and backtrack in the input to the point where it started trying to match the rule that failed. Then it sees if you have anything left in your theory to describe what it is seeing. If not, parse returns a value to you indicating that your theory was "false".

If parse runs out of input but your theory hasn't finished (you proposed that there should be more there than there is), parse will again return a value of "false".

If, after processing everything, parse finds your theory was accurate, you get the value "true" returned to you.

Some of the valid keywords that you can put in a parse rule do not have any effect on your "theory". They exist to allow side effects to occur while parse is working through the "experiment".

Ok, some example rules. Each of these is a single rule, and parse will need to tick each off as being successful.

Rule          Description
------------  -----------
<             Expect the string consisting of a single less-than character.
thru {"}      Expect zero or more characters up to and including the double-quote character.
to {<}        Expect zero or more characters followed by, but not including, the less-than character.
4 digits      Expect 4 occurrences of the pattern matched by the rule named "digits".
copy name     Set the word "name" to a copy of the input sequence that is matched by the very next rule.
(print name)  On encountering this, execute it.

Hopefully this line of thinking clears it up a bit.

Brett.
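
The hypothesis-and-backtrack model above can be seen with a pair of alternatives. This is only a sketch under REBOL 2 semantics; the names tag and kind are made up for illustration, and alpha is defined the same way as in message 2.

```rebol
digits: charset "0123456789"
alpha: charset [#"a" - #"z" #"A" - #"Z"]

; Two competing "theories" about what sits between "<" and ">".
; If 'some digits fails, parse backtracks to just after the "<"
; and tries the alternative after the bar.
tag: [{<} [some digits (kind: 'uid) | some alpha (kind: 'word)] {>}]

parse/all {<1234>} tag    ; first alternative explains the input
print kind

parse/all {<o>} tag       ; first alternative fails, second one matches
print kind
```

Both calls should return true; only the paren that belongs to the successful alternative is evaluated, which is why kind differs between the two runs.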

 [7/18] from: shannon:ains:au at: 14-Dec-2000 21:25

Re: 'Parse is peculiar! - more details


Thanks Peter and Andrew, you both know your 'parse. Unfortunately your answers didn't help me with the first issue. Perhaps I need to explain the problem more clearly.

I have a large collection of log files generated by a Counter-Strike games server. When a user connects, a line is generated that looks like these:

L 09/22/2000 - 15:37:25: "*`Ultimate_Master`*<4718>" <WON:26073391>" connected, ip 202.54.232.2
L 09/22/2000 - 15:43:10: "[IMPREA]Smart-Gun!!<4723><WON:23014199>" connected, ip 220.34.24.3
L 09/22/2000 - 15:47:30: "{[FrAg]}-MaN-<4727><WON:19220729>" connected, ip "132.76.43.24"
L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip "160.34.64.112"

As you can see, the game appears to impose few restrictions on the range of characters and spaces allowed in a name. It's also a stupid log format. This means I can't use:

parse line [thru {"} copy name to {<}]

...because names like <Usyd> H4XXOR would cause parse 'copy to return 'none or a partial name. To make matters worse, I just realised that the number following the name (the user id) is not restricted to 4 digits like I originally thought. Therefore I need the following behavior from 'parse:

go thru {"} then copy all text to name until the UID pattern "< some digits >" is found.

I don't want the UID included in 'name though. In the lines above that would mean 'name is set to "*`Ultimate_Master`*", "[IMPREA]Smart-Gun!!", "{[FrAg]}-MaN-" and "<Usyd> H4XX0R" respectively.

I don't want to go thru the UID, I want to go 'to it, but still get a 'true result.

Also Peter replied:
>> >> line1: {Lets find "Julie<1234>"}
>> >> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
<<quoted lines omitted: 5>>
>** Script Error: name has no value
>** Near: print name

I guarantee that this line works in the latest rebol/view (experimental) for Win9X. The version is 'REBOL/View 0.10.38.3.1 28-Nov-2000'. Which version do you have? This looks like inconsistent behavior. (print name) should execute after the closing '>' is matched.

Once again, I realise it may be easier to do a find/last or some other trick, but I'm more interested in understanding how parse works. It would also make my code cleaner, since I use the select convention to choose rules, as in:

REBOL ["CS Log Parser"]
; lines omitted
digits: charset "0123456789"
search: func ["Returns matches for search item"
    line [string!] "Line to search"
    item [string!] {Valid types are "date", "time", "user", "won-id", "ip"}
] [
    if not value? 'item [item: ask {Search for? ("date", "time", "user", "won-id", "ip"): }]
    rules: [
        "date" [thru {L } copy match to { -} to end (print match)]
        "time" [thru { - } copy match to {: "} to end (print match)]
        "user" [thru {: "} copy match to {"} to end (print match)] ; this line is wrong
        "ip" [thru {ip "} copy match to {"} to end]
        "won-id" [thru {<WON:} copy match to {>} to end]
    ]
    parse line select rules item
]
; __________example_______________________________________________________
log-line: {L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip 160.34.64.112 }
search log-line "user"

 [8/18] from: shannon:ains:au at: 14-Dec-2000 21:38


NOTE: I think my last reply was blocked due to having one 'Re:' too many in the subject line. If you already read this, please ignore.

(Repeat of message 7/18.)

SpliFF

 [9/18] from: brett:codeconscious at: 14-Dec-2000 22:36


> L 09/22/2000 - 15:37:25: "*`Ultimate_Master`*<4718>" <WON:26073391>" connected, ip "202.54.232.2"
> L 09/22/2000 - 15:43:10: "[IMPREA]Smart-Gun!!<4723><WON:23014199>" connected, ip "220.34.24.3"
> L 09/22/2000 - 15:47:30: "{[FrAg]}-MaN-<4727><WON:19220729>" connected, ip 132.76.43.24
> L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip 160.34.64.112
> As you can see the game appears to impose few restrictions on the range of characters and
> spaces allowed in a name. It's also a stupid log format.

Hmm. I agree, a really ugly log format... It really isn't the best example for getting confident with parse. But I can see it would be really good to get it to work :)

> Therefore I need the following behavoir from 'parse:
>
> go thru {"} then copy all text to name until the UID pattern "< some digits >" is found.
> I don't want the UID included in 'name though. In the lines above that would mean 'name
> is set to "*`Ultimate_Master`*", "[IMPREA]Smart-Gun!!", "{[FrAg]}-MaN-" and "<Usyd>
> H4XX0R" respectively.
>
> I don't want to go thru the UID I want to go 'to it, but still get a 'true result.

Is that log line you gave for Ultimate_Master accurate? There seems to be a double-quote after the UID.

The problem that I can see is that somebody might have a name like <4444>. They might get really weird and use something like <4444><4444> as a name as well. The only consistency I can see for determining what really comes after the name is the string that starts with <WON:

So I reckon, on the basis of what I've seen, that the only way to get the name accurately is to find the UID accurately, and doing that requires searching backwards from the <WON: for a <UID>. The problem here is that parse won't do this neatly in one go.

One idea, then, is to get parse to capture from the beginning of the name through to the <WON: and then call itself to process the internal bit. My worry with this is whether parse is reentrant or not.

Or better, use parse as before but, instead of having it call itself, use find. I know you are trying to avoid find, but it may actually be the simplest to code and understand.

Alternatively, you could probably get parse to backtrack in a fashion by creating a truly evil parse rule. I managed to do this once but, as I said, it is truly evil (modifies the input stream and other horrors). :)

Anyway, I'll wait to hear your thoughts and to find out whether that double-quote was real or not.

Brett.
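
One way to anchor on <WON: without searching backwards is to step through the name one character at a time and only accept a UID that is immediately followed by <WON:. This is a sketch, not a tested solution for the real logs: it assumes REBOL 2 parse semantics, uses parse/all, assumes the stray double-quote in the Ultimate_Master line is a typo, and the names uid-mark, start and stop are invented for the example.

```rebol
digits: charset "0123456789"

; Hypothetical rule: a UID only counts when "<WON:" follows it directly.
uid-mark: [{<} some digits {><WON:}]

log-line: {L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip "160.34.64.112"}

name: none
parse/all log-line [
    thru {: "} start:               ; remember where the name begins
    some [
        ; mark the current position, then test whether the UID starts here
        stop: uid-mark (name: copy/part start stop) to end
        | skip                      ; otherwise move on by one character
    ]
]
print name
```

For the sample line this should leave name set to <Usyd> H4XX0R. A name such as <4444> is stepped over, because its closing ">" is not followed by <WON:.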

 [10/18] from: emptyhead:home:nl at: 14-Dec-2000 12:32


Because the game doesn't check the characters being inserted in the name, the grammar of the logfile is not correct and not parseable by a left-to-right parser (not with this 'parse function).

This works fine if the player does not have a name containing the string {>"}:

r-tag: [ ">" thru "<" ]
parse line [
    thru {"} copy name thru {>"} (
        ; parse the name in reverse.
        reverse name
        parse name [
            {"} copy won r-tag copy num r-tag copy name to end
            (reverse name reverse won reverse num)
        ]
    ) to end
]

You can add more rules to parse the IP number and date.

Daan Oosterveld

Shannon Baker wrote:

 [11/18] from: ingo:2b1 at: 14-Dec-2000 13:53


Hi Brett,

here's my take, assuming that "<WON:" is sure not to be in the name-string (and it is the only thing you can be sure about ...

line1: {Lets find "Ju<l>ie<1234><WON:90966776>"}
main-rule: [
    (next-part: "")
    thru {"} copy name to {<}
    some sub-rule
    to end
]
sub-rule: [
    "<WON:" (print name) to end
    |
    (if next-part <> "" [append name join "<" next-part])
    skip copy next-part to "<"
]
>> parse line1 main-rule
Ju<l>i<97868769>

OK, what do I do?

In main-rule, first create an empty string next-part (no copy needed, as the string is not changed; the word gets assigned to new strings later on). Up to copy name it's the same as before, but then we go into the sub-rule.

When entering sub-rule we are right before the "<", so we can test whether it is <WON: by chance. If it is, we're done: print the name and go to the end.

If it's not "<WON:", we first test whether next-part is still "" (first pass), and we either do nothing or append a "<" and next-part to the name. Then we skip over the "<" (remember? that's where we entered the sub-rule!) and copy the next part of the name up to the next "<".

The rule ends here, but we've told parse to check this rule multiple times (some), so we just re-enter the sub-rule ...

I hope this helps,
Ingo

Once upon a time Brett Handley spoketh thus:
> > L 09/22/2000 - 15:37:25: "*`Ultimate_Master`*<4718>" <WON:26073391>"
> connected, ip
<<quoted lines omitted: 54>>
> [rebol-request--rebol--com] with "unsubscribe" in the
> subject, without the quotes.

--
do http://www.2b1.de/
We ARE all ONE, We ARE all FREE

 [12/18] from: arolls::bigpond::net::au at: 15-Dec-2000 1:18


Shannon,

If name has a value before the parse, then there is no error and it returns true. Can you check from a fresh start?

digits: charset "0123456789"
line1: {Lets find "Julie<1234>"}
parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]

Anton.

 [13/18] from: vodalee::gte::net at: 14-Dec-2000 5:06

Re: 'Parse is peculiar!


I hate to butt in, but I just stumbled across REBOL while surfing thru an ice storm here in Texas.

If you would like an example of 'A PARSE' function, take a look at OBJREXX: www2.hursley.ibm.com

For string handling, it's hard to beat REXX on any box.

The first thing that impressed me about REBOL -- they evidently have the Gregorian Calendar right -- something Micro$oft, H_P et al. never have done. Date functions are the first thing I check in any computer language. Also, REBOL's e-mail handling techniques appear useful.

Bob Hamilton
Richardson, Texas
[mail--bobh--to]

 [14/18] from: allenk:powerup:au at: 15-Dec-2000 7:00

Re: 'Parse is peculiar! - more details


Why not read/lines and then use

entry: parse line {"}
== ["L" "09/22/2000" "-" "15:49:22:" "<Usyd> H4XX0R<124><WON:20007739>" "connected," "ip" "160.34.64.112"]

This will give you a consistent 8-part format. You can then use entry/1, entry/2, etc. for simple access to the results.

Cheers,
Allen K

 [15/18] from: brett:codeconscious at: 15-Dec-2000 12:00


Hi Ingo,

Good one! I stand corrected: you don't need something evil to achieve it, just another way of looking at it.

Your code fails on the fourth example though, because name has a value of none. With just a small change, it works:

sub-rule: [
    "<WON:" (print name) to end
    |
    (
        if next-part <> "" [
            if not name [name: copy ""]
            append name join "<" next-part
        ]
    )
    skip copy next-part to "<"
]

Brett

 [16/18] from: ingo:2b1 at: 15-Dec-2000 8:19


Hi Brett,

found an error in my last post: it didn't work for "<<"

line1: {Lets find "Ju<<<l>ie<1234><WON:90966776>"}
main-rule: [
    (next-part: "")
    thru {"} copy name to {<}
    some sub-rule
    to end
]
sub-rule: [
    "<WON:" (print name) to end
    |
    (if next-part <> "" [
        if none? next-part [next-part: ""]
        append name join "<" next-part
    ])
    skip copy next-part to "<"
]
>> parse line1 main-rule
Ju<<<l>i

(If "<" characters directly follow each other, next-part is set to none, so I have to change it to "".)

kind regards,
Ingo

Once upon a time Ingo Hohmann spoketh thus:
> Hi Brett,
> here's my take, assuming that "<WON:" is sure not to be in the name-string
<<quoted lines omitted: 19>>
> I hope this helps,
> Ingo

--
YES! That's just me, just being!
http://www.2b1.de/
We ARE all ONE --- [ingo--2b1--de] --- We ARE all FREE

 [17/18] from: shannon::ains::net::au at: 16-Dec-2000 20:16

'Parse is peculiar! - fresh start


REBOL/View 0.10.38.3.1 28-Nov-2000
Copyright 2000 REBOL Technologies. All rights reserved.
>> do {
{ digits: charset "0123456789"
{ line1: {Lets find "Julie<1234>"}
{ parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]}
** Script Error: name has no value.
** Where: print name
>> parse line1 [thru {"} copy name [thru {<} 4 digits {>}] (print name) to end]}
Julie<1234>

You're right, Anton. 'name is assigned outside the sub-block and can't be referenced in it. Either the usual REBOL word-within-context system doesn't apply to parse, or it doesn't assign name until it reaches the last ']' in the sub-block. I assumed it would work similarly to this:

>> name: "Anton" do [print name]
Anton

Anton wrote:

 [18/18] from: arolls:bigpond:au at: 16-Dec-2000 23:16


Shannon,

> You're right Anton. 'name is assigned outside the sub-block and can't be
> referenced in it. Either the usual rebol word-within-context system doesn't
> apply to parse or it doesn't assign name until it reachs the last ']' in the
> sub-block.

That's not why! :) There is no such limitation. It's the mistake pointed out earlier by Pekr. The (print name) *should* be after the final end-bracket ] of the sub-rule [thru {<} 4 digits {>}]. The copy expects a successful sub-rule after 'name before it assigns 'name a value. But the sub-rule tries to print out 'name first. This is like trying to say:

>> name: rejoin ["<1234>" name]
** Script Error: name has no value.
** Where: name

You get this error because rejoin ["<1234>" name] happens first. 'name has no value and therefore can't be evaluated. You probably didn't catch this problem with your advanced parse because 'name was set to a value in an earlier test.

Try this out:

line1: "Ju<li>e<><><<1234>"
id: [thru "<" 3 4 digits ">"]
rule: [a: some [id | [skip b:]] (print copy/part a b)]
parse line1 rule

'a is set to the beginning of the input. 'some keeps trying to match the 'id. If it can't, which is true for the first 12 characters, it just 'skips over a character and sets 'b to that position. After it has got through the guts of 'rule, it prints out all the stuff between 'a and 'b.

Anton.

Notes
  • Quoted lines have been omitted from some messages.
    View the message alone to see the lines that have been omitted