[REBOL] Re: 'Parse is peculiar! - more details
From: shannon:ains:au at: 14-Dec-2000 21:25
Thanks Peter and Andrew, you both know your 'parse. Unfortunately your answers didn't
help me with the first issue. Perhaps I need to explain the problem more clearly. I have
a large collection of log files generated by a Counter-Strike game server. When a user
connects, a line like one of these is generated:
L 09/22/2000 - 15:37:25: "*`Ultimate_Master`*<4718>" <WON:26073391>" connected, ip 202.54.232.2
L 09/22/2000 - 15:43:10: "[IMPREA]Smart-Gun!!<4723><WON:23014199>" connected, ip 220.34.24.3
L 09/22/2000 - 15:47:30: "{[FrAg]}-MaN-<4727><WON:19220729>" connected, ip "132.76.43.24"
L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip "160.34.64.112"
As you can see, the game appears to impose few restrictions on the range of characters
and spaces allowed in a name. It's also a stupid log format. This means I can't use:
parse line [thru {"} copy name to {<}]
...because names like <Usyd> H4XX0R would cause the parse 'copy to return 'none or a partial
name. To make matters worse, I just realised that the number following the name (the user
id) is not restricted to 4 digits as I originally thought. Therefore I need the
following behaviour from 'parse:
go thru {"} then copy all text to name until the UID pattern "< some digits >" is found.
I don't want the UID included in 'name though. In the lines above that would mean 'name
is set to "*`Ultimate_Master`*", "[IMPREA]Smart-Gun!!", "{[FrAg]}-MaN-" and "<Usyd>
H4XX0R" respectively.
I don't want to go thru the UID; I want to go 'to it, but still get a 'true result.
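One possible sketch of that behaviour, assuming position markers (set-words inside the rule) and copy/part are acceptable in place of 'copy ... 'to. The names uid, get-name, start and stop are hypothetical, made up for this illustration. The rule steps through the line one character at a time and remembers where the first "<", digits, ">" run begins:

```rebol
digits: charset "0123456789"
uid: ["<" some digits ">"]    ; hypothetical helper rule: "<", one or more digits, ">"

; hypothetical helper: returns the text between the first {"} and the first UID,
; or none if no UID pattern is found
get-name: func [line [string!] /local start stop name] [
    name: none
    parse/all line [
        thru {"} start:                ; remember where the name begins
        any [
            ; stop: marks the position before the UID is tested, so the
            ; UID itself is never part of the copied name
            stop: uid (name: copy/part start stop) to end
            | skip                     ; no UID here, move on one character
        ]
    ]
    name
]

print get-name {L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip "160.34.64.112"}
```

For the last sample line this should print <Usyd> H4XX0R, because "<Usyd>" fails the "some digits" test and is skipped like any other text. The sketch stops at the first UID-like run, so a name that itself contains something like <12> would be cut short; requiring "<WON:" straight after the UID would be one way to tighten it.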
Also Peter replied:
>> >> line1: {Lets find "Julie<1234>"}
>> >> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
>> Julie<1234>
>> == true
>
>huh, what? tried above code and it fails ;-) (print name) has to be after the closing
>bracket, or you get:
>
>->> parse line1 [thru {"} copy name [thru {<} 4 digits {>} (print name)] to end]
>** Script Error: name has no value
>** Near: print name
I guarantee that this line works in the latest rebol/view (experimental) for Win9X. The
version is 'REBOL/View 0.10.38.3.1 28-Nov-2000'. Which version do you have? This looks
like inconsistent behavior. (print name) should execute after the closing '>' is matched.
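The ordering question can be isolated with a small sketch. The assumption here, not confirmed for either build, is that copy name [...] assigns name only after the whole bracketed sub-rule has matched, so a paren inside the brackets runs before the assignment. If name still held a value from an earlier attempt in the same console session, the inner (print name) would print that stale value rather than raise an error. The variant below puts the paren after the closing bracket and unsets name first, so it does not depend on session state:

```rebol
digits: charset "0123456789"
line1: {Lets find "Julie<1234>"}

unset 'name    ; start clean so a stale value cannot mask the ordering

; the paren now sits after the sub-rule, i.e. after copy has set name
parse line1 [thru {"} copy name [thru {<} 4 digits {>}] (print name) to end]
```

Running the original line after the same unset 'name would be one way to compare the two builds on equal terms.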
Once again I realise it may be easier to do a find/last or some other trick, but I'm more
interested in understanding how parse works. It would also make my code cleaner, since I
use the select convention to choose rules, as in:
REBOL ["CS Log Parser"]
; lines omitted
digits: charset "0123456789"
search: func [
    "Returns matches for search item"
    line [string!] "Line to search"
    item [string!] {Valid types are "date", "time", "user", "won-id", "ip"}
][
    if not value? 'item [item: ask {Search for? ("date", "time", "user", "won-id", "ip"): }]
    ; each key is followed by the parse rule used for that kind of search
    rules: [
        "date" [thru {L } copy match to { -} to end (print match)]
        "time" [thru { - } copy match to {: "} to end (print match)]
        "user" [thru {: "} copy match to {"} to end (print match)] ; this line is wrong
        "ip" [thru {ip "} copy match to {"} to end]
        "won-id" [thru {<WON:} copy match to {>} to end]
    ]
    parse line select rules item
]
; __________example_______________________________________________________
log-line: {L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip 160.34.64.112}
search log-line "user"
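For anyone unfamiliar with the select convention used above, here is a cut-down sketch of the lookup on its own. The two-entry table is copied from the script, parse/all is an assumption added so that spaces in the line are treated as ordinary characters, and the sample line is taken from the log excerpt earlier in this message:

```rebol
; two entries copied from the rules table in the script above
rules: [
    "date" [thru {L } copy match to { -} to end]
    "time" [thru { - } copy match to {: "} to end]
]

log-line: {L 09/22/2000 - 15:49:22: "<Usyd> H4XX0R<124><WON:20007739>" connected, ip "160.34.64.112"}

; select returns the value that follows the key, here the rule block
probe select rules "time"

; the selected block is handed straight to parse as its rule
if parse/all log-line select rules "time" [print match]
```

select returns none for a key that is not in the table, so it may be worth checking the result before passing it to parse.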