AltME groups: search
world-name: r3wp
Group: Parse ... Discussion of PARSE dialect [web-public] | ||
Gabriele: 25-Dec-2006 | not a bug - you are not skipping the newline, so to "^/" will always match. you are not getting to the end. | |
BrianH: 27-Dec-2006 | Nevermind, failing isn't a problem here. | |
Maxim: 28-Dec-2006 | hi, yesterday I realized I have a 1400 line single parse ruleset which amounts to ~40kb of code ! :-) I was wondering what are your largest Parse rulesets, I'm just curious at how people here are pushing REBOL into extremes. I might also say that parse is wildly efficient in this case, allowing my server to decipher 600 bytes of binary data through all of this huge parse rule within 0.01 - 0.02 seconds (spitting out a nested rebol block tree of rebxml2xml ready data). | |
Maxim: 28-Dec-2006 | like bounds checking, making sure some items are not within a specific area, etc. | |
Geomol: 28-Dec-2006 | My largest Parse rulesets are in NicomDoc. The scripts nicomdoc.r and ndmath.r parse from text to rebxml format. They are 20k and 24k. ndrebxml2html.r parses from NicomDoc rebxml format to html, and that is a 28k script, mostly parse rules. I once built a html dialect, and that was 24k. | |
Geomol: 28-Dec-2006 | And yes, parse is a great tool! | |
Tomc: 28-Dec-2006 | Max complement charset ... can often be used as a sort of NOT | |
Maxim: 28-Dec-2006 | true, but that does not match many cases. obviously one can build NOT-A NOT-B and so on.. but man.. that gets tedious, which is not what parse should be. | |
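[Editor's note] A minimal sketch of the complement-charset trick Tomc mentions, assuming REBOL 2. The names digit and non-digit are illustrative and do not come from the discussion. As Maxim points out, this only negates single characters, not multi-character patterns.

```rebol
rebol []

digit: charset "0123456789"
non-digit: complement digit        ; every character that is NOT a digit

; skip any leading non-digits, then capture the run of digits
parse/all "abc123" [any non-digit copy num some digit]
probe num                          ; "123"
```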
Robert: 29-Dec-2006 | And don't forget to take a look at Gab's compile-rule function. | |
Ladislav: 29-Dec-2006 | lad, maybe, but if you change the name of the variable to copy to you have then to change it twice in the rule. - right. That is a general problem of procedural programming style. OTOH, the "opt skip" variant has got another problem - the "opt skip" code is related only to the first alternative, which seems to me like the reason why Joe doesn't like it | |
Ladislav: 29-Dec-2006 | have a look at: http://www.compkarori.com/vanilla/display/compile-rules.r my contribution is at: http://www.compkarori.com/vanilla/display/TO%2C+THRU+And+NOT+PARSE+Rules | |
Graham: 29-Dec-2006 | Me too .. but it must be a vanilla problem. Are you logged in when you read the page, or as guest? | |
Oldes: 19-Jan-2007 | Isn't this a bug?
>> b: "1234^@567" parse/all b [copy i to {^@} 1 skip b: to end] probe i probe b
1234
567
BUT:
>> b: "1234^@567" parse/all b [copy i to #{00} 1 skip b: to end] probe i probe b
1234
1234^@567 | |
Maxim: 19-Jan-2007 | I wish it did too. it would make some things a little bit simpler. | |
Volker: 19-Jan-2007 | not perfect, but a way to use numbers | |
Oldes: 28-Feb-2007 | how to parse such a string: {some.string/(a + (b.a * c()))/end} to get: ["some" "string" "(a + (b.a * c()))" "end"] | |
Maxim: 28-Feb-2007 | here is a full script :-)
rebol []
paren-start: charset "("
paren-end: charset ")"
parens: union paren-start paren-end
separator: charset [#"/" #"."]
label: complement union separator parens
content: complement parens
blk: copy []
str: {some.string/(a + 1 / 2 (b.a * c()))/end}
expression: [paren-start any [content | expression] paren-end]
parse/all str [
    some [
        separator
        | here: some [label] there: (append blk copy/part here there)
        | here: expression there: (append blk copy/part here there)
    ]
]
probe blk
ask "..." | |
Steeve: 7-Mar-2007 | I made a pause during which I raised sheep in the French Alps | |
Steeve: 7-Mar-2007 | the reality ?, i'm just a nerd, i must admit this weakness, i can't live without technology | |
Maxim: 7-Mar-2007 | hehe not enough buttons on a sheep I guess ;-) | |
Sunanda: 8-Mar-2007 | I've never heard of such a script, Steeve. It does not seem to be on REBOLtech (a forerunner to REBOL.org). You could try some more detailed searches than I did if you want to look further http://www.reboltech.com/library/scripts/ **** Sadly, a lot of good stuff gets published on personal websites, and when the enthusiasm for REBOL wanes, or the site is taken offline for some reason, the scripts are lost to the wider community. | |
sqlab: 8-Mar-2007 | Steeve, you can have a look at the scripts archive http://www.rebol.org/cgi-bin/cgiwrap/rebol/ml-display-thread.r?m=rmlYCHQ | |
Maxim: 13-Apr-2007 | I am having a hard time with using REMOVE on a parsed string. | |
Maxim: 13-Apr-2007 |
symbol: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "_-?!*+~"]
nstr: "...aa.....a.....a.....h."
parse/all nstr [
    some [
        symbol
        | end skip
        | [
            here: (
                probe here
                either none? here/1 [
                    print nstr
                    print "!!"
                ][
                    print here/1
                    print nstr
                    remove here
                    here: back here
                ]
            ) ; :here skip
        ] ;:here
    ]
]
probe nstr | |
Maxim: 13-Apr-2007 | doh.... was about to give a better example... then I realise the error... there is nothing to match in the last rule, just an expression, so a no match is always matching nothing ! | |
btiffin: 13-Apr-2007 | It's nice just thinking out loud once in a while...we're here for you Maxim. Cerebration. :) | |
Maxim: 13-Apr-2007 | well, I was trying to say that I had not realized this was possible... and it's quite cool... we can actually use that in some ways ... make rules which make parse become an event handler for example ! the moment you feed a string some value, parse will start treating it... | |
Maxim: 13-Apr-2007 | and then fall back to silence... (just inserting a little wait in the loop will take care of cpu load) | |
Maxim: 13-Apr-2007 | it would be nice if the result from the expression could be used to determine if the rule is a match or not... | |
btiffin: 13-Apr-2007 | Off topic but...that was what intrigued me with SNOBOL and Icon...succeed, fail and a result. | |
btiffin: 13-Apr-2007 | If you haven't, take a read through Icon pattern matching...mondo powerful. Off topic...sorry. | |
Maxim: 13-Apr-2007 | here is the solution... complement the valid symbols and match them explicitly.
rebol []
symbol: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9" "_-?!*+~"]
invalid: complement symbol
nstr: "...aa.....a.....a.....h."
end-rule: []
parse/all nstr [
    some [
        symbol
        | [here: invalid (remove here)] :here
    ]
] | |
btiffin: 13-Apr-2007 | More off topic...I wept a little bit when I heard of Dr. Ralph Griswold passing, back in October. Never met him, much respect. | |
btiffin: 13-Apr-2007 | Final off topic; Now I'm slowly replacing all my computer heroes...Names like Kernighan, Pike, Moore, Griswold, Lovelace... are now Sassenrath, DocKimbel, Anton, Cyphre, Graham, Maxim, Ladislav, Henrik, Oldes...et al. Thanks guys. You are making my world a better place. | |
Ladislav: 13-Apr-2007 | Max: "it would be nice if the result from the expression could be used to determine if the rule is a match or not" - that is of course possible as follows: | |
Ladislav: 13-Apr-2007 | right, but the value of the expression *can be used* to determine if a rule is a match | |
Ladislav: 13-Apr-2007 | otherwise, I am for addition of a rule which would take the result of the paren! expression directly into account without us having to resort to this (more complicated) way | |
Ladislav: 13-Apr-2007 | if you use a more appropriate rule name like check-result, you have got a more readable: | |
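[Editor's note] The code Ladislav posted with these messages is not preserved in this log. One common REBOL 2 idiom that fits his description, shown here as an assumption rather than his actual code, is to let the paren expression set a word to either an always-matching or a never-matching rule, and then use that word as the next rule. The function name small-number? and the limit of 50 are purely illustrative.

```rebol
rebol []

digit: charset "0123456789"

; An empty block always matches; [end skip] can never match
; (the same "always fail" trick appears in Maxim's rule earlier in the log).
small-number?: func [str [string!] /local n check-result] [
    parse/all str [
        copy n some digit
        (check-result: either (to-integer n) < 50 [[]] [[end skip]])
        check-result
    ]
]

probe small-number? "42"    ; true
probe small-number? "77"    ; false
```

The paren itself still never fails. It only chooses which rule the following word refers to, which is why Ladislav calls this the more complicated way.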
btiffin: 13-Apr-2007 | guru question; Will a utype! definition be allowed to wrap builtins? SNMP MIBs require a fairly heavy weight tuple! But will a short MIB conflict with internal scans of tuple! or do utype! scans take some form of precedent? I've become curious, yet remain dumb enough to not know. | |
Oldes: 14-Apr-2007 | Isn't this a bug?
>> parse [a/b] [a/b]
** Script Error: a has no value
** Near: parse [a/b] [a/b] | |
Oldes: 14-Apr-2007 | I don't want the a to be evaluated in the parse rules! | |
Oldes: 14-Apr-2007 | hm... ech.. I'm stupid.. normally it is evaluated as well, so it's not a bug.. but is there any way to parse a specific path! ? | |
Oldes: 14-Apr-2007 | I mean:
>> parse [a] ['a]
== true
>> parse [a/b] ['a/b]
== false | |
ChristianE: 14-Apr-2007 |
>> parse [a/b] [(path: 'a/b) path]
== true
>> parse [a/c] [(path: 'a/b) path]
== false | |
Gabriele: 14-Apr-2007 | older versions did not evaluate paths. since newer versions do, we need 'a/b to work. dunno if this is in RAMBO... but it needs to be fixed. | |
Oldes: 14-Apr-2007 | Yes I know they were not evaluated before, but I'm not sure if it's not a feature, that they are evaluated now. | |
Oldes: 14-Apr-2007 | I just think, that maybe it would be good to have parse [a/b] ['a/b] == true as is parse [a] ['a] | |
Oldes: 14-Apr-2007 | ..because it would not be useful anyway as I would have to write a special rule for each refinement. | |
Gabriele: 14-Apr-2007 | it's not a bug that they are evaluated (in fact it was requested in a rambo ticket). it's a bug that - since now they are evaluated - lit-paths are not used to match paths. | |
Anton: 14-Apr-2007 | Maybe if the result of parens were parsed, we could use a paren to evaluate a path (and don't use a paren to leave as is). | |
Gabriele: 16-Apr-2007 | it looks like that 3.0 won't have a new parse, but i don't have any details and i'm just guessing. | |
PeterWood: 16-Apr-2007 | Does that imply there won't be a Unicode Charset with which to parse unicode strings? | |
btiffin: 16-Apr-2007 | There is going to be a unicode! datatype | |
Henrik: 17-Apr-2007 | Perhaps vector! will play a part in solving the unicode problem | |
Gabriele: 17-Apr-2007 | you can make a bitset with 65000 bits in r2... so why not in r3? | |
Pekr: 17-Apr-2007 | I don't know, as for me, I just wanted to|thru [a | b | c] :-) | |
Gabriele: 17-Apr-2007 | we won't stop at 3.0... there will be a 3.1 and so on... at least we hope so :) | |
Rebolek: 24-May-2007 | Is there some way to make this work: parse "aaa" [some "a" "a"] or does PARSE just not work this way? | |
Geomol: 24-May-2007 | What do you mean?
>> parse "aaa" [some "a"]
== true
Why the second "a"? | |
Geomol: 24-May-2007 | Parsing for [some "a" "a"] will return false, because you've already parsed past the "a"s. | |
Geomol: 24-May-2007 | A clumsy way of doing it:
>> parse "aaa" [some "a" p: (p: skip p -1) :p "a"]
== true | |
BrianH: 24-May-2007 | parse "aaa" [some [p: "a"] :p "a"] | |
BrianH: 24-May-2007 | Not in my version. The p is set before the position advances past the "a", so it is already back. | |
BrianH: 24-May-2007 | The p is reset before "a" is consumed - that is why I put [p: "a"] in []. | |
BrianH: 24-May-2007 | Interesting. It seems to be setting the last p before it fails on the last iteration of "a". | |
BrianH: 24-May-2007 | Clearly I need a temporary. | |
BrianH: 24-May-2007 | parse "aaa" [some [p1: "a" (p2: :p1)] :p2 "a"] | |
BrianH: 24-May-2007 | A temporary will work better with parts of unknown size, and be faster too. | |
BrianH: 24-May-2007 | Still, you might want to apply rewrite rules to your generated parse rules - that code seems a little sloppy. | |
Oldes: 24-May-2007 | that you will not have [some "a" "a"] but just [some "a"] | |
BrianH: 24-May-2007 | By rewrite rules, I mean something like what Gabriele came up with for the rebcode assembler a while ago. Since I helped refine his work, I may still have a copy somewhere. I'll take a look. | |
Geomol: 24-May-2007 | Define readable! ;-) Maybe you could use a combination of to-string, to-binary, debase and things like that. | |
Rebolek: 24-May-2007 | if i do (a: charset "abc") i want to do also (decharset a) to get "abc" :) that's readable ;) | |
Geomol: 24-May-2007 | Rebolek, use my hokus-pokus function:
hokus-pokus: func [
    value
    /local a out
][
    either bitset? value [
        a: enbase/base to-binary value 2
        out: copy ""
        forall a [
            if a/1 = #"1" [append out to-char (index? a) - 4]
        ]
        out
    ][
        42
    ]
]
>> a: charset "abc"
>> hokus-pokus a
== "abc" | |
Gregg: 24-May-2007 | Yes, Brett has built a lot of very cool stuff. Haven't seen him around for a while though. | |
Oldes: 26-May-2007 | and... it would be good to have just a function which returns the translated Rebol parse block | |
Rebolek: 26-May-2007 | And yes, function returning just parse rules will be done, this is just a work in progress | |
Oldes: 26-May-2007 | and anyway... 12 or 8 millions google results is not a big difference if your page is not listed between first 20 pages:) | |
Oldes: 26-May-2007 | you can use... http://www.googlefight.com/ or make a Rebol version... it's quite easy | |
Rebolek: 26-May-2007 | in the file i posted is a function REGSET that converts small bit of regex to bitset, its syntax seems to be easier than charset's syntax (charset [#"a" - #"z" #"0" - #"9"] vs regset "a-z0-9") | |
Gregg: 26-May-2007 | Very nice Boleslav! What regex engine/syntax are you going for compatibility with (if any)? Charset syntax is probably that way because it's a dialect, and Carl wanted a string as input to be easy, without escapes and such; just my guess. | |
BrianH: 26-May-2007 | You should wrap your code in a context. | |
BrianH: 26-May-2007 | You should separate the regex compilation phase from its application phase, and just write a wrapper that calls both in order. The compilation phase is often more complex than just applying the results, so if you are using the regex repeatedly you should just compile it once. | |
Rebolek: 26-May-2007 | Oldes, I thought about just a translator from regex to parse rules and I'm not sure it will be easy, I'm using my 'tail-parse that matches rules in reversed order that is better for regex syntax. Maybe there's some other way. | |
Rebolek: 26-May-2007 | this is the problem with [some "a" "a"]. This is equivalent of "a*a" in regex which is perfectly valid, but problematic in parse. This is simple example, but it can get quite complicated so I'm not sure I can handle all cases. The reversed order seemed simpler. But you will probably prove me wrong :) | |
BrianH: 26-May-2007 | BTW, "a*a" is directly equivalent to [any "a" "a"], not some. | |
BrianH: 26-May-2007 | Most of the changes were made to make it faster and to use less memory overhead.
- It is faster for parse to match a one-character string than a character value.
- Insert is faster than union, and makes no temporaries.
- If you are capturing a single character, I think [a: skip (a: first a)] is faster than [copy a skip (a: first a)].
- Path access is slower than the equivalent native, so [first a] instead of [a/1].
- The fastest loop is loop, even with the math to calculate the number of times. | |
BrianH: 26-May-2007 | Aside from the one-time bind, repeat may be faster than loop with a self-incremented index. | |
BrianH: 26-May-2007 | It might be a good idea to run a peephole optimizer on the patterns before compiling them, to convert ones like "a*a" to "aa*". | |
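[Editor's note] A small sketch of why the "a*a" to "aa*" rewrite helps, assuming REBOL 2 parse semantics as described in this thread (ANY and SOME are greedy and do not give characters back). The two rules below are hand translations of the two regex forms, not output from regex.r.

```rebol
rebol []

; "a*a" translated literally: ANY consumes every "a",
; so nothing is left for the final "a"
probe parse "aaa" [any "a" "a"]    ; false

; the rewritten form "aa*" accepts the same strings without backtracking
probe parse "aaa" ["a" any "a"]    ; true
```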
Rebolek: 27-May-2007 | Hi Brian, thanks for support, I was out for a sleep :) | |
BrianH: 27-May-2007 | Yeah, so it does. I wonder why the docs don't say (will be local) like it does for foreach. It still ends up faster than loop when you have to keep track of an index or a counter. | |
Dockimbel: 27-May-2007 | Brian, you've stated that "It is faster for parse to match a one-character string than a character value." It seems to me that the opposite statement is true. (matching a char! is faster than matching a one-character string!) | |
BrianH: 27-May-2007 | It seems to me that the opposite _should_ be true, but parse converts the character to a string before matching it - no conversion is performed for string values. It's just one of those weird things. | |
Ladislav: 28-May-2007 | my measurements show:
>> time-block [parse "a" ["a"]] 0.05
== 3.83615493774414E-7
>> time-block [parse "a" [#"a"]] 0.05
== 3.61204147338867E-7
, i.e. the opposite | |
BrianH: 28-May-2007 | Which version? Nevermind, my timing differences may just be a multitasking artifact. | |
BrianH: 28-May-2007 | Too small a sample for a busy computer. | |
BrianH: 28-May-2007 | Rebolek, I gather you made the parse go in reverse to handle rules like "a+a" better. How does your reverse code handle "aa+", or "aa+a" - same problem? | |
Dockimbel: 28-May-2007 | Here's another benchmark:
>> data: head insert/dup make string! 10'000'000 #"a" 10'000'000
>> t0: now/time/precise loop 10 [parse data [some "a"]] now/time/precise - t0
== 0:00:06.078
>> t0: now/time/precise loop 10 [parse data [some #"a"]] now/time/precise - t0
== 0:00:04.296
Running this test several times shows that char! matching is, on average, 30 % faster than string! matching. | |
BrianH: 28-May-2007 | Well there you go. That's different numbers than last time, but more dramatic. It's just a #, easy fix :) | |
Dockimbel: 28-May-2007 | Didn't want to sound "dramatic", but just wanted to provide a more accurate measure. Sure whatever datatype is used (char! or string!) in regex.r, that won't change much the overall speed. ;-) |
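[Editor's note] To repeat the comparison above without retyping the timing code, the measurement can be wrapped in a small helper. This is a sketch assuming REBOL 2; time-it is a hypothetical name (it is not Ladislav's time-block), and the string is shortened to one million characters to keep the run brief. Timings will differ per machine and, as BrianH notes, per system load.

```rebol
rebol []

; Hypothetical helper: evaluate CODE n times and return the elapsed time.
; Uses the clock of the current day, so a run that crosses midnight is not handled.
time-it: func [n [integer!] code [block!] /local t0] [
    t0: now/time/precise
    loop n code
    now/time/precise - t0
]

data: head insert/dup make string! 1'000'000 #"a" 1'000'000

print ["string! rule:" time-it 10 [parse data [some "a"]]]
print ["char!   rule:" time-it 10 [parse data [some #"a"]]]
```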