World: r3wp
[Parse] Discussion of PARSE dialect
Ladislav 1-Dec-2010 [5353x2] |
    some [
        thru {<h2><a} thru ">" copy name to {<}
        doc-start: to {^/ </div>} doc-end: :doc-start
        [
            thru {<pre class="code">} copy code to {</pre}
            (probe name probe code)
            any [
                thru {<h5>} here: if (lesser? index? here index? doc-end)
                copy arg to {<}
                thru {<ol><p>} copy arg-desc to {</p></ol>}
                (printf [" * " 10 " - "] reduce [arg arg-desc])
            ]
        ]
        :doc-end
    ] |
Nevertheless, both the variant you posted and the variant I posted parse part of the text more than once. A variant that parses the text only once can be written as well. | |
Steeve 1-Dec-2010 [5355] | This should work with R3:
    some [
        thru {<h2><a} thru ">" copy name to {<}
        copy doc to {^/ </div>} :doc
        thru {<pre class="code">} copy code to {</pre}
        (probe name probe code)
        any [
            thru {<h5>} copy arg to {<}
            thru {<ol><p>} copy arg-desc to {</p></ol>}
            (printf [" * " 10 " - "] reduce [arg arg-desc])
        ]
    ]
Notice the :doc, which switches the input currently being parsed. |
Oldes 1-Dec-2010 [5356] | Ladislav's last version works, but it's far from easy to use for lazy parsing. I think I will stay with my version ;-) |
Ladislav 1-Dec-2010 [5357] | Use what suits your needs best. Nevertheless, in terms of code size etc., they are the same (they even share the property that part of the text is parsed twice). |
BrianH 1-Dec-2010 [5358] | Was Carl's proposed LIMIT keyword implemented yet? |
Ladislav 1-Dec-2010 [5359] | Not yet, I guess. |
BrianH 1-Dec-2010 [5360x2] | That is what he proposed to deal with this issue. I look forward to it. |
I use the new INTO string feature a lot with file management code. | |
Steeve 1-Dec-2010 [5362] | Hey, you don't like my solution? It's simple enough, no need for LIMIT. |
BrianH 1-Dec-2010 [5363x2] | I like your solution, and would use solutions like that. I just would prefer to have LIMIT :) |
It has a lot of overhead though (copy overhead). | |
Steeve 1-Dec-2010 [5365] | I knew you would say that... :-) |
BrianH 1-Dec-2010 [5366] | You have to be careful with INTO string though because there is a lot of PARSE code out there that depended on INTO failing with non-blocks, and triggering an alternation. Learn to like AND type INTO if your code depends on that. |
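The "AND type INTO" guard can be sketched roughly as below. This is only an illustration, assuming R3's AND (look-ahead) keyword and datatype matching in block parsing; the words data and strict and the sample values are hypothetical and not taken from the discussion.

```rebol
; Hypothetical data mixing strings and blocks.
data: ["abc" [1 2] "xyz"]

; AND block! looks ahead without consuming, so INTO is only attempted
; when the current value really is a block. Strings fall through to
; the SKIP alternative, as older blocks-only INTO code expected.
strict: [
    some [
        and block! into [some integer!]
        | skip
    ]
]

parse data strict
```

Without the AND block! guard, a rule that relied on INTO failing for "abc" would now descend into the string instead of taking the alternative.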
Ladislav 1-Dec-2010 [5367] | "Hey you don't like my solution ?" - I guess that Oldes does not like it, since it does not "stay in the limit" of DOC |
Steeve 1-Dec-2010 [5368] | He just has to switch back again. I don't think he even understood or read what I proposed. |
BrianH 1-Dec-2010 [5369] | Ah, I must have misunderstood the solution then. I thought it was a two-pass thing with subparsing of a copy. |
Ladislav 1-Dec-2010 [5370] | aha, you would like to use the get-word to switch the input |
Steeve 1-Dec-2010 [5371] | Brian, it is |
Ladislav 1-Dec-2010 [5372] | I have seen that proposed, but it is not available currently (I would support such a proposal, though) |
Steeve 1-Dec-2010 [5373x2] | Ladislav, it's working in R3 currently |
and has been for a while | |
Ladislav 1-Dec-2010 [5375] | Checking |
BrianH 1-Dec-2010 [5376] | That's what I thought. I originally proposed that kind of input switching in 2000. It can cause problems with backtracking though, so a sub-parse in an IF operation can be safer. |
Steeve 1-Dec-2010 [5377] | Lots of backtracking problems may arise in lots of ways when you do parsing. I'm not sure it's an argument :-) |
BrianH 1-Dec-2010 [5378] | There are no backtracking problems with COPY x TO thing IF (parse x rule). But that was the original reason we didn't have input switching. The reason I requested input switching in the first place was to make it easier to implement the continuous parsing that Pekr was requesting at the time :) |
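The COPY x TO thing IF (parse x rule) shape mentioned here can be sketched as follows. It is only an illustration, assuming an R3 build that has the IF keyword; the input string and the words sample, doc, and doc-rule are made up and do not come from the discussion.

```rebol
; Hypothetical sample input; only its shape matters.
sample: {<h2>name</h2> <pre>code</pre> </div> trailing text}

; Hypothetical sub-rule, applied only to the copied region.
doc-rule: [thru "<pre>" copy code to "</pre>" to end]

parse sample [
    thru "<h2>" copy name to "<"
    copy doc to "</div>"        ; limit: copy everything up to the end marker
    if (parse doc doc-rule)     ; sub-parse the copy; a failure backtracks normally
    to end
]
```

Because the outer input is never switched, backtracking in the outer rule only has to restore an index. The price is the copy overhead noted earlier in the thread.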
Steeve 1-Dec-2010 [5379] | I don't see where the problem is; you just have to switch back to the original series if the sub-rule fails, no need for the if (parse ...) thing. |
BrianH 1-Dec-2010 [5380] | Originally, only position was reset, not series reference. I would welcome tests that show what the current behavior is. Ladislav? |
Steeve 1-Dec-2010 [5381x2] | I didn't say the reference was restored automatically; you have to do it yourself. |
.... restore: [:switch-serie sub-rule | :restore] | |
BrianH 1-Dec-2010 [5383] | If it is not restored automatically on failure, backtracking and alternation, then that is a problem that needs a ticket submitted for it. |
Steeve 1-Dec-2010 [5384x2] | Agreed, it would be a nice improvement |
but it may slow down parse, no? | |
BrianH 1-Dec-2010 [5386] | Not much, just one more pointer assignment at alternation. |
Steeve 1-Dec-2010 [5387] | Ladislav, are you lost in translation ? Or are you crying :-) |
BrianH 1-Dec-2010 [5388x4] | It fails. Here is the test code that I will put in the ticket:
    >> a: "a" b: "b" parse a [:b "b" (print true) fail | "a"]
    true
    == false ; should be true
    >> a: "a" b: "b" parse a [:b "b" (print true) fail | "b"]
    true
    == true ; should be false |
So, half of the request succeeds: You can set the position to another series. I wonder if you can change series types from string to block. | |
Yup, you can. | |
It is not a simple problem though, as not only would you have to add a series reference to the fallback state but you would need to make those series references visible to the garbage collector so they won't be freed; backtracking to a freed series would be bad. | |
Steeve 1-Dec-2010 [5392x2] | Parse currently frees its own allocated resources, what would that be a problem to pursue? |
*why would that be... | |
BrianH 1-Dec-2010 [5394] | What if someone runs RECYCLE in a paren? It would need to know what to not collect. |
Steeve 1-Dec-2010 [5395] | I mean, Parse must use a sort of stack to keep the backtracking references. The series will not be freed until parse destroys its stack. |
BrianH 1-Dec-2010 [5396] | Right now it is a stack of integers (position) and a single pointer (series reference). To do this it would need to be a stack of series references too, and the collector would need to be informed of its existence so it could scan it for references. |
Steeve 1-Dec-2010 [5397] | That's why I said previously that it may slow down the whole process. |
BrianH 1-Dec-2010 [5398] | Yup. The ticket needs to be made either way. If it is rejected it will serve as documentation of the issue. |
Ladislav 1-Dec-2010 [5399x2] | It can cause problems with backtracking though - actually, it can't, as can be demonstrated easily |
(when implemented properly, of course) | |
BrianH 1-Dec-2010 [5401] | Submitted as #1787, with the "when implemented properly" workarounds that Ladislav was mentioning. Note: Just because there is a solution to a problem doesn't make it not a problem - it just makes it a problem that can be solved. |
Ladislav 1-Dec-2010 [5402] | aha, so now the get-words can set parse to a different series (INTO does that as well!), but what is restored is just the index, not the series... (except for the return from INTO, when the series is restored as well) |
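The INTO case, where the outer series is restored on return, might look like this. A minimal sketch, assuming the R3 INTO string feature discussed earlier in the thread; the word data and its contents are hypothetical.

```rebol
; Hypothetical input: a block holding a string followed by an integer.
data: ["ab" 42]

parse data [
    into ["a" "b"]   ; the input is temporarily the string "ab"
    integer!         ; back in the outer block: series and index both restored
]
```

The get-word form differs in that nothing switches the input back automatically, which is the gap the ticket describes.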