AltME groups: search


results summary

world	hits
r4wp	5907
r3wp	58701
total:	64608

results window for this page: [start: 17301 end: 17400]

world-name: r3wp

Group: Parse ... Discussion of PARSE dialect [web-public]
Gabriele: 25-Dec-2006	not a bug - you are not skipping the newline, so to "^/" will always match. you are not getting to the end.
BrianH: 27-Dec-2006	Nevermind, failing isn't a problem here.
Maxim: 28-Dec-2006	hi, yesterday I realized I have a 1400 line single parse ruleset which amounts to ~40kb of code ! :-) I was wondering what are your largest Parse rulesets, I'm just curious at how people here are pushing REBOL into extremes. I might also say that parse is wildly efficient in this case, allowing my server to decipher 600 bytes of binary data through all of this huge parse rule within 0.01 - 0.02 seconds (spitting out a nested rebol block tree of rebxml2xml ready data).
Maxim: 28-Dec-2006	like bounds checking, making sure some items are not within a specific area, etc.
Geomol: 28-Dec-2006	My largest Parse rulesets are in NicomDoc. The scripts nicomdoc.r and ndmath.r parse from text to rebxml format. They are 20k and 24k. ndrebxml2html.r parses from NicomDoc rebxml format to html, and that is a 28k script, mostly parse rules. I once built a html dialect, and that was 24k.
Geomol: 28-Dec-2006	And yes, parse is a great tool!
Tomc: 28-Dec-2006	Max complement charset ... can often be used as a sort of NOT
Maxim: 28-Dec-2006	true, but that does not match many cases. obviously one can build NOT-A NOT-B and so on.. but man.. that gets tedious, which is not what parse should be.
Robert: 29-Dec-2006	And don't forget to take a look at Gab's compile-rule function.
Ladislav: 29-Dec-2006	lad, maybe, but if you change the name of the variable to copy to you have then to change it twice in the rule. - right. That is a general problem of procedural programming style. OTOH, the "opt skip" variant has got another problem - the "opt skip" code is related only to the first alternative, which seems to me like the reason why Joe doesn't like it
Ladislav: 29-Dec-2006	have a look at: http://www.compkarori.com/vanilla/display/compile-rules.r my contribution is at: http://www.compkarori.com/vanilla/display/TO%2C+THRU+And+NOT+PARSE+Rules
Graham: 29-Dec-2006	Me too .. but it must be a vanilla problem. Are you logged in when you read the page, or as guest?
Oldes: 19-Jan-2007	Isn't this a bug? >> b: "1234^@567" parse/all b [copy i to {^@} 1 skip b: to end] probe i probe b 1234 567 BUT: >> b: "1234^@567" parse/all b [copy i to #{00} 1 skip b: to end] probe i probe b 1234 1234^@567
Maxim: 19-Jan-2007	I wish it did too. it would make some things a little bit simpler.
Volker: 19-Jan-2007	not perfect, but a way to use numbers
Oldes: 28-Feb-2007	how to parse such a string: {some.string/(a + (b.a * c()))/end} to get: ["some" "string" "(a + (b.a * c()))" "end"]
Maxim: 28-Feb-2007	here is a full script :-) rebol [] paren-start: charset "(" paren-end: charset ")" parens: union paren-start paren-end separator: charset [#"/" #"."] label: complement union separator parens content: complement parens blk: copy [] str: {some.string/(a + 1 / 2 (b.a * c()))/end} expression: [paren-start any [content | expression] paren-end] parse/all str [some [ separator | here: some [label] there: (append blk copy/part here there) | here: expression there: (append blk copy/part here there)]] probe blk ask "..."
Steeve: 7-Mar-2007	I made a pause during which I raised sheep in the French Alps
Steeve: 7-Mar-2007	the reality ?, i'm just a nerd, i must admit this weakness, i can't live without technology
Maxim: 7-Mar-2007	hehe not enough buttons on a sheep I guess ;-)
Sunanda: 8-Mar-2007	I've never heard of such a script, Steeve. It does not seem to be on REBOLtech (a forerunner to REBOL.org). You could try some more detailed searches than I did if you want to look further http://www.reboltech.com/library/scripts/ **** Sadly, a lot of good stuff gets published on personal websites, and when the enthusiasm for REBOL wanes, or the site is taken offline for some reason, the scripts are lost to the wider community.
sqlab: 8-Mar-2007	Steeve, you can have a look at the scripts archive http://www.rebol.org/cgi-bin/cgiwrap/rebol/ml-display-thread.r?m=rmlYCHQ
Maxim: 13-Apr-2007	I am having a hard time with using REMOVE on a parsed string.
Maxim: 13-Apr-2007	symbol: charset [#"a" - #"z" #"A" - #"Z" #"0" #"9" "_-?!*+~"] nstr: "...aa.....a.....a.....h." parse/all nstr [ some [ symbol | end skip | [ here: ( probe here either none? here/1 [print nstr print "!!" ][print here/1 print nstr remove here here: back here]) ; :here skip ] ;:here ] ] probe nstr
Maxim: 13-Apr-2007	doh.... was about to give a better example... then I realise the error... there is nothing to match in the last rule, just an expression, so a no match is always matching nothing !
btiffin: 13-Apr-2007	It's nice just thinking out loud once in a while...we're here for you Maxim. Cerebration. :)
Maxim: 13-Apr-2007	well, I was trying to say that I had not realized this was possible... and it's quite cool... we can actually use that in some ways ... make rules which make parse become an event handler for example ! the moment you feed a string some value, parse will start treating it...
Maxim: 13-Apr-2007	and then fall back to silence... (just inserting a little wait in the loop will take care of cpu load)
Maxim: 13-Apr-2007	it would be nice if the result from the expression could be used to determine if the rule is a match or not...
btiffin: 13-Apr-2007	Off topic but...that was what intrigued me with SNOBOL and Icon...succeed, fail and a result.
btiffin: 13-Apr-2007	If you haven't, take a read through Icon pattern matching...mondo powerful. Off topic...sorry.
Maxim: 13-Apr-2007	here is the solution... complement the valid symbols and match them explicitly. rebol [] symbol: charset [#"a" - #"z" #"A" - #"Z" #"0" #"9" "_-?!*+~"] invalid: complement symbol nstr: "...aa.....a.....a.....h." end-rule: [] parse/all nstr [ some [ symbol | [here: invalid (remove here) ] :here ] ]
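A minimal sketch of Maxim's fix laid out over several lines, assuming REBOL 2 parse semantics. The range `#"0" - #"9"` is an assumption (the message lists `#"0" #"9"` without the dash), and `copy` is added so the literal string is not modified in place. The point it illustrates is the `:here` reset: after `remove here` the series is one character shorter, so the input position has to be moved back to `here` or the next character would be skipped.

```rebol
rebol []

; valid characters; the digit range is assumed, see the note above
symbol:  charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "_-?!*+~"]
invalid: complement symbol

nstr: copy "...aa.....a.....a.....h."

parse/all nstr [
    some [
        symbol                              ; keep a valid character
        | [here: invalid (remove here)]     ; match one invalid char, delete it
          :here                             ; resume at the same index
    ]
]

probe nstr
```

With the sample string this should leave only the letters, that is "aaaah". Because the last alternative now has to match a real character, the loop ends at the tail of the string instead of matching nothing forever.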
btiffin: 13-Apr-2007	More off topic...I wept a little bit when I heard of Dr. Ralph Griswold passing, back in October. Never met him, much respect.
btiffin: 13-Apr-2007	Final off topic; Now I'm slowly replacing all my computer heroes...Names like Kernighan, Pike, Moore, Griswold, Lovelace... are now Sassenrath, DocKimbel, Anton, Cyphre, Graham, Maxim, Ladislav, Henrik, Oldes...et al. Thanks guys. You are making my world a better place.
Ladislav: 13-Apr-2007	Max: "it would be nice if the result from the expression could be used to determine if the rule is a match or not" - that is of course possible as follows:
Ladislav: 13-Apr-2007	right, but the value of the expression can be used to determine if a rule is a match
Ladislav: 13-Apr-2007	otherwise, I am for addition of a rule which would take the result of the paren! expression directly into account without us having to resort to this (more complicated) way
Ladislav: 13-Apr-2007	if you use a more appropriate rule name like check-result, you have got a more readable:
btiffin: 13-Apr-2007	guru question; Will a utype! definition be allowed to wrap builtins? SNMP MIBs require a fairly heavy weight tuple! But will a short MIB conflict with internal scans of tuple! or do utype! scans take some form of precedent? I've become curious, yet remain dumb enough to not know.
Oldes: 14-Apr-2007	Isn't this a bug? >> parse [a/b] [a/b] Script Error: a has no value Near: parse [a/b] [a/b]
Oldes: 14-Apr-2007	I don't want the a to be evaluated in the parse rules!
Oldes: 14-Apr-2007	hm... ech.. I'm stupid.. normally it is evaluated as well, so it's not a bug.. but is there any way to parse a specific path! ?
Oldes: 14-Apr-2007	I mean: >> parse [a] ['a] == true >> parse [a/b] ['a/b] == false
ChristianE: 14-Apr-2007	>> parse [a/b] [(path: 'a/b) path] == true >> parse [a/c] [(path: 'a/b) path] == false
Gabriele: 14-Apr-2007	older versions did not evaluate paths. since newer versions do, we need 'a/b to work. dunno if this is in RAMBO... but it needs to be fixed.
Oldes: 14-Apr-2007	Yes I know they were not evaluated before, but I'm not sure if it's not a feature, that they are evaluated now.
Oldes: 14-Apr-2007	I just think, that maybe it would be good to have parse [a/b] ['a/b] == true as is parse [a] ['a]
Oldes: 14-Apr-2007	..because it would not be useful anyway as I would have to write a special rule for each refinement.
Gabriele: 14-Apr-2007	it's not a bug that they are evaluated (in fact it was requested in a rambo ticket). it's a bug that - since now they are evaluated - lit-paths are not used to match paths.
Anton: 14-Apr-2007	Maybe if the result of parens were parsed, we could use a paren to evaluate a path (and don't use a paren to leave as is).
Gabriele: 16-Apr-2007	it looks like that 3.0 won't have a new parse, but i don't have any details and i'm just guessing.
PeterWood: 16-Apr-2007	Does that imply there won't be a Unicode Charset with which to parse unicode strings?
btiffin: 16-Apr-2007	There is going to be a unicode! datatype
Henrik: 17-Apr-2007	Perhaps vector! will play a part in solving the unicode problem
Gabriele: 17-Apr-2007	you can make a bitset with 65000 bits in r2... so why not in r3?
Pekr: 17-Apr-2007	I don't know, as for me, I just wanted to|thru [a | b | c] :-)
Gabriele: 17-Apr-2007	we won't stop at 3.0... there will be a 3.1 and so on... at least we hope so :)
Rebolek: 24-May-2007	Is there some way to make this work: parse "aaa" [some "a" "a"] or does PARSE just not work this way?
Geomol: 24-May-2007	What do you mean? >> parse "aaa" [some "a"] == true Why the second "a"?
Geomol: 24-May-2007	Parsing for [some "a" "a"] will return false, because you've already parsed past the "a"s.
Geomol: 24-May-2007	A clumsy way of doing it: >> parse "aaa" [some "a" p: (p: skip p -1) :p "a"] == true
BrianH: 24-May-2007	parse "aaa" [some [p: "a"] :p "a"]
BrianH: 24-May-2007	Not in my version. The p is set before the position advances past the "a", so it is already back.
BrianH: 24-May-2007	The p is reset before "a" is consumed - that is why I put [p: "a"] in [].
BrianH: 24-May-2007	Interesting. It seems to be setting the last p before it fails on the last iteration of "a".
BrianH: 24-May-2007	Clearly I need a temporary.
BrianH: 24-May-2007	parse "aaa" [some [p1: "a" (p2: :p1)] :p2 "a"]
BrianH: 24-May-2007	A temporary will work better with parts of unknown size, and be faster too.
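A minimal sketch of the exchange above, assuming REBOL 2 parse semantics. `some` is greedy and does not give input back, so a trailing "a" can only match once the position has been moved back by hand. The words `p1` and `p2` are the ones from BrianH's message.

```rebol
rebol []

; SOME consumes every "a", so nothing is left for the trailing "a"
; (Geomol reports false for this rule).
probe parse "aaa" [some "a" "a"]

; BrianH's temporary: p1 marks the start of each iteration, and p2 is
; updated only after "a" has matched, so p2 ends up at the last "a".
; :p2 then moves the input back there and the final "a" can match.
probe parse "aaa" [some [p1: "a" (p2: :p1)] :p2 "a"]
```

One caveat with this workaround: because the last matched "a" is reused, the second rule appears to accept a single "a" as well, which the regex "a+a" would not.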
BrianH: 24-May-2007	Still, you might want to apply rewrite rules to your generated parse rules - that code seems a little sloppy.
Oldes: 24-May-2007	that you will not have [some "a" "a"] but just [some "a"]
BrianH: 24-May-2007	By rewrite rules, I mean something like what Gabriele came up with for the rebcode assembler a while ago. Since I helped refine his work, I may still have a copy somewhere. I'll take a look.
Geomol: 24-May-2007	Define readable! ;-) Maybe you could use a combination of to-string, to-binary, debase and things like that.
Rebolek: 24-May-2007	if i do (a: charset "abc") i want to do also (decharset a) to get "abc" :) that's readable ;)
Geomol: 24-May-2007	Rebolek, use my hokus-pokus function: hokus-pokus: func [ value /local a out ][ either bitset? value [ a: enbase/base to-binary value 2 out: copy "" forall a [ if a/1 = #"1" [append out to-char (index? a) - 4] ] out ][ 42 ] ] >> a: charset "abc" >> hokus-pokus a == "abc"
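As an alternative that does not depend on how a bitset is laid out in its binary form, here is a minimal sketch that simply tests every character value. The name `decharset` follows Rebolek's wish above and is hypothetical, not an existing function. `find/case` is used because case handling of `find` on bitsets may differ between REBOL versions; that is an assumption, not something confirmed in this thread.

```rebol
rebol []

; Hypothetical helper: return a string holding every char in BITS.
decharset: func [bits [bitset!] /local out c] [
    out: copy ""
    repeat i 256 [
        c: to-char i - 1                    ; chars 0 through 255
        if find/case bits c [append out c]
    ]
    out
]

probe decharset charset "abc"
```

Characters come back in code-point order, so the result is a canonical form of the set rather than the original string passed to `charset`.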
Gregg: 24-May-2007	Yes, Brett has built a lot of very cool stuff. Haven't seen him around for a while though.
Oldes: 26-May-2007	and... it would be good to have just a function which returns the translated Rebol parse block
Rebolek: 26-May-2007	And yes, function returning just parse rules will be done, this is just a work in progress
Oldes: 26-May-2007	and anyway... 12 or 8 millions google results is not a big difference if your page is not listed between first 20 pages:)
Oldes: 26-May-2007	you can use... http://www.googlefight.com/ or make a Rebol version... it's quite easy
Rebolek: 26-May-2007	in the file i posted is a function REGSET that converts small bit of regex to bitset, its syntax seems to be easier than charset's syntax (charset [#"a" - #"z" #"0" - #"9"] vs regset "a-z0-9")
Gregg: 26-May-2007	Very nice Boleslav! What regex engine/syntax are you going for compatibility with (if any)? Charset syntax is probably that way because it's a dialect, and Carl wanted a string as input to be easy, without escapes and such; just my guess.
BrianH: 26-May-2007	You should wrap your code in a context.
BrianH: 26-May-2007	You should separate the regex compilation phase from its application phase, and just write a wrapper that calls both in order. The compilation phase is often more complex than just applying the results, so if you are using the regex repeatedly you should just compile it once.
Rebolek: 26-May-2007	Oldes, I thought about just a translator from regex to parse rules and I'm not sure it will be easy, I'm using my 'tail-parse that matches rules in reversed order that is better for regex syntax. Maybe there's some other way.
Rebolek: 26-May-2007	this is the problem with [some "a" "a"]. This is equivalent of "a*a" in regex which is perfectly valid, but problematic in parse. This is simple example, but it can get quite complicated so I'm not sure I can handle all cases. The reversed order seemed simpler. But you will probably prove me wrong :)
BrianH: 26-May-2007	BTW, "a*a" is directly equivalent to [any "a" "a"], not some.
BrianH: 26-May-2007	Most of the changes were made to make it faster and to use less memory overhead. - It is faster for parse to match a one-character string than a character value. - Insert is faster than union, and makes no temporaries. - If you are capturing a single character, I think [a: skip (a: first a)] is faster than [copy a skip (a: first a)]. - Path access is slower than the equivalent native, so [first a] instead of [a/1]. - The fastest loop is loop, even with the math to calculate the number of times.
BrianH: 26-May-2007	Aside from the one-time bind, repeat may be faster than loop with a self-incremented index.
BrianH: 26-May-2007	It might be a good idea to run a peephole optimizer on the patterns before compiling them, to convert ones like "a*a" to "aa*".
Rebolek: 27-May-2007	Hi Brian, thanks for support, I was out for a sleep :)
BrianH: 27-May-2007	Yeah, so it does. I wonder why the docs don't say (will be local) like it does for foreach. It still ends up faster than loop when you have to keep track of an index or a counter.
Dockimbel: 27-May-2007	Brian, you've stated that "It is faster for parse to match a one-character string than a character value." It seems to me that the opposite statement is true. (matching a char! is faster than matching a one-character string!)
BrianH: 27-May-2007	It seems to me that the opposite _should_ be true, but parse converts the character to a string before matching it - no conversion is performed for string values. It's just one of those weird things.
Ladislav: 28-May-2007	my measurements show: >> time-block [parse "a" ["a"]] 0.05 == 3.83615493774414E-7 >> time-block [parse "a" [#"a"]] 0.05 == 3.61204147338867E-7 , i.e. the opposite
BrianH: 28-May-2007	Which version? Nevermind, my timing differences may just be a multitasking artifact.
BrianH: 28-May-2007	Too small a sample for a busy computer.
BrianH: 28-May-2007	Rebolek, I gather you made the parse go in reverse to handle rules like "a+a" better. How does your reverse code handle "aa+", or "aa+a" - same problem?
Dockimbel: 28-May-2007	Here's another benchmark: >> data: head insert/dup make string! 10'000'000 #"a" 10'000'000 >> t0: now/time/precise loop 10 [parse data [some "a"]] now/time/precise - t0 == 0:00:06.078 >> t0: now/time/precise loop 10 [parse data [some #"a"]] now/time/precise - t0 == 0:00:04.296 Running this test several times shows that char! matching is, on average, 30 % faster than string! matching.
BrianH: 28-May-2007	Well there you go. That's different numbers than last time, but more dramatic. It's just a #, easy fix :)
Dockimbel: 28-May-2007	Didn't want to sound "dramatic", but just wanted to provide a more accurate measure. Sure whatever datatype is used (char! or string!) in regex.r, that won't change much the overall speed. ;-)
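The benchmark above can be wrapped in a small helper so both variants are timed the same way. This is a sketch only: `time-it` is a hypothetical name, the input is shortened to one million characters, and the timings will vary by machine and REBOL version, so no figures are given here.

```rebol
rebol []

; Hypothetical helper: run CODE N times and return the elapsed time!.
; Uses the same now/time/precise subtraction as the session above,
; so a run that crosses midnight would give a misleading result.
time-it: func [n [integer!] code [block!] /local t0] [
    t0: now/time/precise
    loop n code
    now/time/precise - t0
]

data: head insert/dup make string! 1'000'000 #"a" 1'000'000

print ["string! rule:" time-it 10 [parse data [some "a"]]]
print ["char!   rule:" time-it 10 [parse data [some #"a"]]]
```

As BrianH notes, a small sample on a busy computer is unreliable, so it is worth repeating the measurement several times before drawing a conclusion.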
