AltME groups: search



world-name: r3wp

Group: Parse ... Discussion of PARSE dialect [web-public]
Steeve: 3-Oct-2009	And you all missed my (N Fail) proposal.
Steeve: 3-Oct-2009	I just rewrote the math expressions resolver. digit: charset "0123456789" num: [some digit opt [#"." any digit]] term: [num | #"(" any lv1 term #")" | #"-" any lv3 term] calc: [ remove [copy num1 term copy op skip copy num2 term] (expr: do reform select [ "+" [num1 op num2] "-" [num1 op num2] "*" [num1 op num2] "/" [num1 op num2] "^^" [num1 "**" num2] "%" [num1 "//" num2] ] op) stay insert expr (probe e) ] lv4: [term #"%" term then fail | break | calc] lv3: [any lv4 term #"^^" any lv4 term then fail | break | calc] lv2: [any lv3 term [#"*" | #"/"] any lv3 term then fail | break | calc] lv1: [any lv2 term [#"+" | #"-"] any lv2 term then fail | break | calc] I just think it's more clear like that. Moreover, it's prepared to use the further AND command. Because this nasty trick i use: [rule THEN FAIL | BREAK | calc] will be replaced by: [AND rule calc]
Pekr: 4-Oct-2009	What is your take on simple mode parsing? It is handy for simple CSV parsing, and the idiom is common: parse/all row ";" The trouble is that if there is no data in the last column, parse mistakenly makes the resulting block shorter, so you have to use the common idiom: rec: parse/all append row ";" ";" I always wondered whether it could be regarded as a parse bug?
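A minimal REBOL 2 sketch of the idiom Pekr describes. The sample `row` is made up, and COPY is an addition (not in the original message) so that APPEND does not change the source string; the two lengths are printed so the difference can be checked in a console.

```rebol
REBOL []

row: "a;b;"                     ; hypothetical record whose last column is empty

; Plain simple-mode split, as in the message
short: parse/all row ";"

; Workaround from the message: append one extra delimiter before splitting.
; COPY is an assumption added here so that ROW itself stays untouched.
full: parse/all append copy row ";" ";"

probe length? short
probe length? full
```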
Henrik: 4-Oct-2009	enline and deline will help somewhat.
Pekr: 4-Oct-2009	Ladislav - in comment to ticket #1248, you write: According to the documentation, that can be found in http://www.rebol.net/wiki/Parse_Project parse "b" [not #"a"] yields FALSE correctly. If you want to obtain TRUE, you can try e.g.: parse "b" [not #"a" to end] My question is - what is the advantage of actually not advancing the input on the rule match? It does not look natural and I would expect it to match the rule and hence move past it: >> parse "b" [not #"a" ??] end!: "b" == false ... as can be seen, it does not advance ...
Ladislav: 4-Oct-2009	What is the advantage?: 1) by not consuming input this would be a direct inversion of the rule. Example: parse "a" [not end ...] is a meaningful rule, and it is quite trivial to see, that any rule consuming input would not be a direct inversion of this rule. NOT SOMETHING actually means, that at the current input position the SOMETHING rule shall not match. That does not give us any information, that NOT should skip any input (how far should it?). 2) This version of NOT is compatible with PEG 3) It is consistent with the AND operation: [AND rule] is equivalent to [NOT [NOT rule]]
Ladislav: 4-Oct-2009	Yet another example: [NOT skip] is equivalent to the [END] rule and is meaningful only, when NOT does not skip any input
Ladislav: 4-Oct-2009	...I would expect it to match the rule and hence move past it... - that is trivially wrong. If the RULE matches, the [NOT RULE] cannot match, therefore it cannot even advance. The only case, when (theoretically) we could think of advancing is, when the rule does not match. But then, it is not known, how far.
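The look-ahead behaviour Ladislav describes can be tried in a REBOL 3 console (NOT is a PARSE keyword there, not in REBOL 2). This is only a sketch: the first two calls are the ones quoted from ticket #1248, the last two are extra illustrations added here.

```rebol
REBOL []

; NOT tests the rule at the current position and never consumes input.
probe parse "b" [not #"a"]           ; case from the ticket: NOT succeeds, input is not at its end
probe parse "b" [not #"a" to end]    ; Ladislav's variant: consume the rest explicitly
probe parse "b" [not #"a" skip]      ; added here: look ahead, then consume one char
probe parse "" [not skip]            ; added here: [not skip] used as an [end] test
```

Because NOT leaves the position alone, whatever should be consumed after a successful look-ahead has to be written out as its own rule (here `to end` or `skip`).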
Maxim: 5-Oct-2009	pekr, I had the same initial reaction, then realized that it would not be consistent wrt fail or no fail... when NOT would succeed a match (and fail the rule), the input would be beyond what the not is useful for. when I started thinking about it, if you really want you can simply use a set word/get word pair to advance when the not finds a match to ignore a rule, but then it's like not using 'NOT in the first place, so it's pointless :-)
Steeve: 6-Oct-2009	I can have a look, but the purpose of NOT is not to have better perfs than complemented charset, but to allow some simplification when writing rules. Actually, It's the case of most other improvements, easier to write, not inevitably faster. And don't forget that safe complemented charset in R3 are a pain in the ass to construct, because of UTF-8
PeterWood: 6-Oct-2009	Which is why I was disappointed that I apparently misunderstood from Carl's blog: Changes that are critical, but not highly complicated. For example, providing a NOT command seems easy enough, and it is now critical because using complemented charsets is problematic (due to the Unicode enhancements).
Steeve: 6-Oct-2009	Well, i saw your script, i don't know if it can be faster, i only can say I would have written it differently. Probably, using parse and load/next for all normal rebol values. I can see that your rule about matching binaries are false. Cause [#{" thru #"}"] is wrong (what if the binary contains the #"^}" char ?)
BrianH: 6-Oct-2009	Peter, Steeve, the original problem that started the parse proposals was the problem of complementing charsets. However, it quickly changed to improving PARSE in general. Then, while we were waiting for the parse proposals to come up on the todo list, we came up with a better solution to complementing charsets, which is not yet implemented and which is not limited to PARSE.
BrianH: 6-Oct-2009	Using a bit in the charset that would mark it as "complemented", and then all of its matching algorithms would do an internal not.
BrianH: 6-Oct-2009	I want to write more port code first and refine the model based on what I learn.
BrianH: 12-Oct-2009	Behavior of BREAK, ANY and SOME decided, finally: http://www.rebol.net/r3blogs/0270.html
BrianH: 12-Oct-2009	And it's finally break from a loop, rather than break from a block (supposedly).
Maxim: 12-Oct-2009	but it's a hell of a powerful addition to parse and to general code control. I don't see why Carl can't see any use for it.
BrianH: 12-Oct-2009	And you can do that with CATCH.
Steeve: 12-Oct-2009	yep, and for functions, you still got THROW/CATCH and RETURN, which are enough to my mind.
BrianH: 12-Oct-2009	The BREAK, THROW, RETURN, EXIT, HALT and QUIT functions are implemented the same way, just with different error codes.
Maxim: 12-Oct-2009	but n BREAK allows us to leverage smaller rules reuse, as if they were large complex rules and still benefit from the same speed of a root rule backtrack.
BrianH: 12-Oct-2009	I think that Carl is trying to balance speed, ease of use, and debugability. In practice n BREAK would be tricky to debug, and doesn't actually reflect what PARSE does internally. Apparently PARSE isn't actually recursive descent - it just fakes it with a state machine.
BrianH: 12-Oct-2009	Because you can't go through the end, not even with THRU END. And once you reach the end, END always succeeds.
BrianH: 12-Oct-2009	And TO "abc" will also continue to succeed, matching the same "abc" every time. THRU "abc" skips past the "abc" like you say.
Pekr: 13-Oct-2009	So according to his doc, we should get BREAK/RETURN and DO?
Pekr: 13-Oct-2009	But generally - the level of feedback is lower and lower. We need to get R3 into beta with requested features in few months, as we are starting to lose ppl being interested ...
Pekr: 13-Oct-2009	well, otoh we lived without OF for so long. I think it can be done in a conventional (recent) way :-) I think that Carl should dedicate few more days to finish parse and move on to Extensions :-)
BrianH: 13-Oct-2009	The only still-missing proposals that aren't easy or efficient to work around are OF and REVERSE. They will be missed if not included. Unfortunately, the same reasons why they will be hard to work around if missing, are the reasons why they would be difficult to implement :(
Graham: 14-Oct-2009	Tim Berners-Lee is quoted today to say that he can't think of a good reason to keep the // in http://, and that if he did it again, he would have done without them. I wonder if he spoke to people who write parsers ....
Gabriele: 15-Oct-2009	the reason for the // is to allow relative paths like: //www.rebol.com/ where the scheme is the same as the base url. Nobody has ever used this; also, it could have been achieved by using :www.rebol.com/ instead... so, yeah, it was not really a good idea. I also don't think ftp:file.txt (meaning, change scheme, but keep host and path) has ever been used and not sure it's supported by software. so in practice http:www.rebol.com/ would have worked.
BrianH: 15-Oct-2009	It's an operator, like |, and mentioned in that section near the top.
Pekr: 15-Oct-2009	isn't AND operator too for e.g.?
Maxim: 17-Oct-2009	I really want to do it... but I'm so deep into parsing right now I don't want to lose the few GB of information in my brain's cache. I'm writing self-modifying parse rules and it's pretty nightmarish. although it works.
Pekr: 17-Oct-2009	An=And
Pekr: 17-Oct-2009	So - we don't need complementing to be enhanced? Because we talked about it, but it is not defined in proposal, it is not part of Carl's feature table, and I also got no reaction on R3 Chat ....
Maxim: 17-Oct-2009	laden with many paren expressions and a stack on top of it.
Maxim: 17-Oct-2009	since I use binding to map inner rules which are also constructed on the fly but have to be pushed and popped from the stack as I traverse data... it's a lot of fun :-D
BrianH: 17-Oct-2009	If the self-modifying rules are strung-together basic blocks, you can use the rule compiler to generate the blocks. And the R3 changes make self-modifying rules less necessary, so you can have even larger basic blocks.
Maxim: 17-Oct-2009	and it's not simple parsing since I use parsing index manipulation, which is also dictated by the source data it encounters. it's like swatting flies using a fly swatter at the end of a rope, while riding a roller coaster which changes layout every time you ride it ;-)
BrianH: 17-Oct-2009	Which is what a rule compiler does :) Actually, it sounds like you could adapt the tricks of the rule compiler to your rule compiler, which would let you use the new operations in your rule source and have the workarounds generated in the output.
Maxim: 17-Oct-2009	well, build it and I will try it ;-)
Pekr: 18-Oct-2009	ah, got reply on Chat from Carl towards complementing: "Re #5718: Pekr, that's a good question, and I think the answer must be YES. We need to be able to complement bitmaps in a nice way. Otherwise, Unicode bitmaps, even if simply used on ASCII chars, would take a lot of memory. This change should be listed on the project sheet, and if not, I'll add it there."
Chris: 22-Oct-2009	Both w1 and w+ appear to be very large values. Would it be smart to perhaps do: [[aw1 | w1] any [aw+ | w+]] Where 'aw1 and 'aw+ are limited to ascii values?
Steeve: 22-Oct-2009	Uses R3 (and his optimized complemented bitsets)
Chris: 22-Oct-2009	Allowing 'into to look inside strings can break current usage of 'into, requiring [and any-block! into ...]
Chris: 22-Oct-2009	An example: a nested d: [k v] structure where 'k is a word and 'v is 'd or any other type: data: [k [k "s"]] In R2, you can validate with d: [word! [into d | skip]] Now you have to specify: d: [word! [and any-block! into d | skip]] otherwise you get an error if 'v is a string!
BrianH: 26-Oct-2009	Chris, there can be an advantage in R3 to breaking up a bitset into more than one bitset on occasion, mostly memory savings. However, it might not work as well as you might like since offset and/or sparse bitsets aren't supported. Bitsets that involve high codepoints will take a lot of RAM no matter what you do.
JoshF: 17-Nov-2009	The second one failed when I tried to extend the dialect with multiply (*) and divide (/). After further experimentation, it seems that you can't escape the "/". Google has not been helpful here... Does anybody have any ideas? I could parse for just a word! instead of the +, -, etc., but I wanted parse to do the work of deciding what was a valid operation or not. Sorry for the multiple messages, I'm still trying to figure this client out... Thanks for any advice!
JoshF: 17-Nov-2009	Both tdiv and lit-div type? to a word!...
Henrik: 17-Nov-2009	And also hence the expression "a block is or isn't loadable"
JoshF: 17-Nov-2009	OK... Mechanically, I see what you're saying, but what's the difference between a lit-word and a word? The spirit eludes me...
JoshF: 17-Nov-2009	I thought there was only word!'s and then everything else were more concrete types. I guess what I am asking is what is the purpose of lit-words?
JoshF: 17-Nov-2009	The difference between what I'm doing and what you linked to is that it's working against a string, while I'm doing a dialect, no?
Janko: 2-Dec-2009	I know I was stopped by parse on some occasions. I think every time the problem would have been solvable if I had, for example, >> to [ "A" | "B" ] where the parser would check where A is and where B is and go to the closest one.
Janko: 2-Dec-2009	I was trying to show an example where you have two possible endings and you want to process both (and you can differently with parens) ) but you don't know in what order they will come or anything
Janko: 2-Dec-2009	yes, then you have to do charset parsing (but I don't know that yet :) ) .. I was just trying to say that if there were a way to say something like to any [ "A" | "B" ] and it would go to the closest one, A LOT of problems with parse would be easily solvable
Graham: 2-Dec-2009	and see which has the best fit ?
Janko: 2-Dec-2009	The pattern is known ... the sentence starts with this is and can end with . or ! but they can come in any order .. if you try to parse with "." first you will get ---- oops, some errors up there .. just a sec
Janko: 2-Dec-2009	this is common to all the problems that I am describing .. if I had > to [ "." | "!" ] and parse would find both and go to the one that is closer it would be solved.
Graham: 2-Dec-2009	Janko, best thing to do is show us a string you can't parse ... and someone will show you how to do it.
Janko: 2-Dec-2009	I don't have real example right now :) I had them few times before and I also asked here about them and I solved with your help somehow
Janko: 2-Dec-2009	I just started talking about this as a general limitation of parse that I meet a lot of times and I suppose Paul could have met it when trying to parse CSV
Gregg: 2-Dec-2009	It's not necessarily a PARSE limitation, but there are things we'd like PARSE to do that aren't always reasonable. :-) TO and THRU can work very well, but that doesn't mean they'll work for every situation. You may have to use rules where you check for your target value or just SKIP, marking locations in the input as you go.
Gregg: 2-Dec-2009	That said, if you know the format (e.g. WRT quotes and escapes), it can be done with PARSE. It just may not be a one-liner.
Janko: 2-Dec-2009	I know parsing csv can be messy ... at least at this high level I don't know how to do it with escapes and commas in etc
Janko: 2-Dec-2009	and I know everything has limitations ... this functionality OR with taking the first that appears would just in practice solve me many cases
Graham: 2-Dec-2009	you have to turn off parse's default delimiters and use bitsets
Ladislav: 2-Dec-2009	Janko: the only problem is, that you cannot use: C: [to [A | B]] , where A and B are "general rules", but you can always write: C: [here: [A | B] :here | skip C] , which would do what you want
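A small REBOL 2 sketch of Ladislav's rule applied to Janko's "whichever ending comes first" case. The values of `A` and `B`, the sample `text` and the `mark:` set-word are assumptions made for illustration only.

```rebol
REBOL []

A: "."                              ; stand-ins for the two possible endings
B: "!"

; Ladislav's rule: try [A | B] at the current position; on success, reset
; the position to HERE (in front of the match); otherwise skip one item
; and try again recursively.
C: [here: [A | B] :here | skip C]

text: "this is a test! Or not."     ; hypothetical input
parse/all text [C mark: to end]
probe mark                          ; input position where C stopped
```

The rule recurses once per skipped character, so for very long inputs the recursion depth may be worth keeping in mind.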
Oldes: 2-Dec-2009	And Janko... if you don't use charsets at all, I think you should give it a try. It's not so difficult. I think that if I can write a parser to colorize PHP code, then you can parse everything.
Janko: 3-Dec-2009	Ladislav, thanks.. I didn't know you could set the position back with :here , that is interesting and probably expands what you can do with parse a lot.
Janko: 3-Dec-2009	yes, you are right .. if you can write a parser for php then you can make anything with it. I always supposed parse with charsets is like low level step by one char in a loop and call "events" and change states, with which you can parse anything from xml to languages .. well but parse with charsets is still much more elegant
Janko: 3-Dec-2009	but it is a level less simple and nice to use than simple parse modes, that's why the simple ones should be powerful if possible too - you can't get a newbie impressed with charset parsing because he probably won't understand it.
Ladislav: 3-Dec-2009	Just to complete the list of possible equivalents to the C: [to [A | B]] rule, here is a way how to do it in Rebol3 parse: C: [while [and [A | B] break | skip | reject]] you can find other equivalent idioms at http://en.wikibooks.org/wiki/REBOL_Programming/Language_Features/Parse#Parse_idioms
Ladislav: 3-Dec-2009	It looks, that I could have used: C: [while [and [A | B] accept | skip | reject]]
jack-ort: 11-Dec-2009	Help! Still struggling to understand parse. How could I replace any and all SINGLE occurrences of the single-quote character anywhere in a string (beginning, middle or end) with TWO single-quotes? But if there are already TWO single-quotes together, I want to leave them alone. TIA for any and all help for a newbie!
Maxim: 11-Dec-2009	easy, actually. you match double quotes first then fallback to single quotes, adding a new one and skiping one char... give me a minute I should get something working...
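The code Maxim and Steeve posted is not part of this excerpt. What follows is only a REBOL 2 sketch of the approach Maxim outlines (match an already doubled quote first, otherwise insert a second quote and step over the original); the function name and sample string are made up.

```rebol
REBOL []

; Hypothetical helper: modifies STR in place and returns it.
double-single-quotes: func [str [string!] /local pos] [
    parse/all str [
        any [
            "''"                                ; already doubled: step over both
            | pos: "'" (insert pos "'") skip    ; lone quote: add one, then step over the original
            | skip                              ; any other character
        ]
    ]
    str
]

probe double-single-quotes copy "it's ''fine''"
```

The ordering of the alternatives matters: if the lone-quote branch came first, every quote of an existing pair would be doubled again.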
jack-ort: 11-Dec-2009	Thanks! I'm going to have to look @ this for awhile to understand why you even need to worry about the double-quote character. Much to learn.... Thanks Maxim and Steeve for the prompt replies!
Rebolek: 11-Dec-2009	Just curious, I tested both versions and Steeve's version is about 2times faster than Maxim's :)
Maxim: 11-Dec-2009	actually, having a paypal account linked with your login and a "donate" button would be really nice :-) right in the chat tool.
Maxim: 11-Dec-2009	I sure would use it... some people have helped save days of work with free code and insight.
Maxim: 12-Dec-2009	I just adopted a new notation standard for parse rules... the goal is to make rules a bit more verbose as to the type of each rule token... I find this reads well in any direction, since we encounter the "=" character when reading from left to right or right to left... and parse rules often have to be read from right to left. example: =terminal=: [ =quote= copy terminal to =quote= skip (print ["found terminal: " terminal]) ] on very large rules, and with the syntax highlighting in my editor making the "=" signs very distinct, I can instantly detect what parts of my rules are other rules or character patterns... it also helps out in the declarations... I see when blocks are intended to be used as rules quite instantly where ever they are in my code. in my current little parser, I find I can edit my rules almost twice as fast and lose MUCH less time scanning my blocks to find the rule tokens, and switching them around. wonder what you guys think about it...
Maxim: 12-Dec-2009	another example.... in this dense block of text, I can spot the =eol= (end of line) token instantly in both x and y dimensions of the rule paragraph: =line-comment=: [ =comment-symbol= [ [thru =eol= (print "comment to end of line")] | [to end] ] (print "success") ]
Maxim: 12-Dec-2009	syntax highlighting colorizes words ... stuff is colorized... but user words aren't colorised and they all get mixed up between functions, variables and rules... and having colors which are too strong next to each other and in relative distribution ... cancels out.
Graham: 12-Dec-2009	so you could write a parser that reads your rules and colorises them ...
Maxim: 12-Dec-2009	I'm just trying to get a feel for what others think about the idea. and sharing a bit of a discovery at the same time, if it may help others. the goal isn't to be popular or convince others... and sorry, if my last line may have looked harsh, it wasn't. :-) I was just resuming your reaction plainly and relaunching the question to be sure others realize I want a few opinions.
PeterWood: 12-Dec-2009	any others care to comment? I'm afraid it looks very messy to me and reminded me of Perl for some reason.
Gregg: 13-Dec-2009	For a long time I've added = to the end of my parse rules, and = to the beginning of parse variables. I think it matches the production rule grammar well, and also emulates set-word/get-word syntax.
Maxim: 13-Dec-2009	I've used word= for other things before and I liked it.
Gregg: 14-Dec-2009	Yup. Different mindset. I just looked at your BNF compiler earlier. Good stuff. I did an ABNF-to-parse generator some time back. ABNF is used in a lot of IETF RFCs and such.
Maxim: 14-Dec-2009	that is nice, is your ABNF parser still accessible somewhere? it could improve the quality and ease of integrating the protocols to R3 IMO. ABNF also seems much more aligned to parse
Maxim: 15-Dec-2009	I've been rewriting bnf generated parse rules (and often a bit cryptically) into proper parse ordered rules for 3 days now... <sigh> C is sooo complex for what it really does. I've discovered a few quite mind-boggling language capabilities... stuff like: char ( (*var)() )[10]; it takes 7 steps to define what that really is and there are other "fun" examples which end up being interpretation nightmares, but look really simple. one thing is certain at this point... although I will be able to build a C to rebol converter with relative precision under specific goals, some of the crazy stuff just will have to be finished manually by humans. at least I rarely see such twisted C code in most of what I've been reading so far.
BrianH: 16-Dec-2009	BNF is just a syntax form, with a lot of variation. The real difference that matters between Yacc and PARSE is the parsing model. Yacc implements an LR parser (or some variant thereof), and PARSE implements a variant of TDPL parsing (related to PEG), though more powerful and with a completely different syntax. How you structure the parse rules depends on the parsing model, not the syntax. For instance, LR parsers tend to do recursion rather than iteration, and when they recurse the recursive call tends to be on the left, with the distinguishing clause on the right. For PEG parsers, recursion goes the other way. This is not an error, this is a difference in parsing model. If you are translating from Yacc to PARSE, it's not just a syntax change. You have to reorganize the rules to match the new model. And watch out: Certain patterns are easier to express in some parsing models than in others. Some patterns aren't supported at all in some models, and so no amount of translation will help you. We chose the TDPL model for PARSE because it is more expressive than the LR model, so in theory you should be able to translate LR rules to PARSE with some topological twists (redoing the structure of the rules). However, there are patterns that you can express in PARSE that can't be translated to LR, even with topological changes.
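To make the point about rule structure concrete, here is a small sketch of a comma-separated number list. The Yacc-style production in the comment and the rule names `num` and `list` are illustrative assumptions, not taken from the discussion.

```rebol
REBOL []

digit: charset "0123456789"
num:   [some digit]

; A Yacc-style grammar would typically be left-recursive:
;     list : list ',' num | num ;
; Transcribed literally as  list: [list "," num | num]  the PARSE rule would
; call itself again before consuming any input, so it never terminates.
; In PARSE the same language is expressed with iteration instead:
list: [num any ["," num]]

probe parse "1,22,333" list     ; true: the whole input is consumed
probe parse "1,,2" list         ; false: no number after the second comma
```

The input contains no whitespace, so plain PARSE (without /ALL) behaves the same way in REBOL 2 and REBOL 3 here.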
Maxim: 16-Dec-2009	my goal is to get the host code and OpenGL headers past the parsing phase. once that is done, I'll start work on adding the production phase. I still have to write the pre-processor, but that in fact is pretty straight forward. there are little rules and they are much more static and well defined on the MS web site.
Maxim: 16-Dec-2009	the funny thing is that the C language reference on the MSDN is actually pretty well done... there are a lot of evil C examples for some of the more obscure parts of the language like pointers, structs and unions. funny thing is that some of the most complex things to express were the literal constants! integers, with octal, hex notation... not as simple as some [digits] ;-)
Henrik: 24-Dec-2009	Looking at the new WHILE keyword and I was quite baffled by Carl's use of it in his latest blog example. Then I read the docs and it didn't get much better: - WHILE is a variant of ANY - ANY stops, if input does not change - WHILE doesn't stop, even if input does not change What does "input does not change" mean? Is it about changing the parse series length during parse? Is it actively moving the parse index back or forth using special commands? Is it normal progression of parse index with each cycle of WHILE or ANY? Is it alteration of the parse series content while maintaining length during parse?
Pekr: 24-Dec-2009	Henrik - according to docs explanation, 'parse contains some internal protection for the case, when input stream does not advance its position. In R2, following code causes infinite loop, in R3, it returns false: parse str [some [to "abc"]] (I am not sure I like that it returns false - normally I expect it to cause infinite loop. This is imo overprotecting programmer, and you have to think, why your code returns false anyway, which for me is the same, as if it would cause an infinite loop) Further from docs: To avoid infinite looping, a special internal rule is triggered based on the fact that the rule did not change the input position. However, this shows a problem with this rule: parse str [some [to "a" remove thru "b"]] Here the input did not appear to advance, but something useful happened. In such cases, the some word should not be used, and the while word is better: parse str [while [to "a" remove thru "b"]]
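The two rules quoted from the docs can be compared side by side in a REBOL 3 console. This is only a sketch with a made-up input string; how SOME behaves depends on the R3 build in use, so no expected output is given and the results are simply printed.

```rebol
REBOL []

; Hypothetical input; each rule gets its own copy because REMOVE edits in place.
s1: copy "1a2b3a4b5"
s2: copy "1a2b3a4b5"

; SOME: per the docs quoted above, may stop when the position did not advance
probe parse s1 [some [to "a" remove thru "b"]]
probe s1

; WHILE: keeps iterating until the inner rule itself fails
probe parse s2 [while [to "a" remove thru "b"]]
probe s2
```

Comparing `s1` and `s2` afterwards shows whether the "input did not change" protection cut the SOME loop short on a given build.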
Pekr: 24-Dec-2009	I probably don't understand the usefulness of 'while at all. Because now I have to think, if my code would cause infinite loop, or not, and use 'some or 'while accordingly ...
Pekr: 24-Dec-2009	Running above examples, my opinion is, that in fact adding 'while was probably not a good decision. I can understand, that now we have more power - our code will not easily cause infinite loops, but otoh you now have to think, if it can happen or not, and 'some becomes your enemy ...
Fork: 28-Dec-2009	?? not initialized after first match? And secondly, how do I match thru a series of things (e.g. integer! integer!, but just wondering about the thte. ?? problem before the first match?)
