World: r3wp
[Parse] Discussion of PARSE dialect
BrianH 17-Nov-2008 [3212x6] | About your matching from a block proposal, if the CHECK proposal gets accepted then I doubt this will - the usage scenarios where you can't just use alternates would be too rare, especially given how easily CHECK (FIND ...) could do the job in those cases. |
Your example with alternates (and bug fixes, still ignoring leap years):
    m31: ["Jan" | "Mar" | "May" | "Jul" | "Aug" | "Oct" | "Dec"] ; joins were in wrong direction
    m30: join m31 [| "Apr" | "Jun" | "Sep" | "Nov"]
    m28: join m30 [| "Feb"]
    b28: next repeat x 28 [repend [] ['| form x]] ; next to skip leading |, numbers don't work in string parsing
    b30: ["29" | "30"] ; optimization based on above reversed joins
    b31: ["31"]
    parse date-str [
        b28 "-" m28 |
        b30 "-" m30 |
        b31 "-" m31
    ]
The above with CHECK instead:
    m31: ["Jan" "Mar" "May" "Jul" "Aug" "Oct" "Dec"]
    m30: join m31 ["Apr" "Jun" "Sep" "Nov"]
    m28: join m30 ["Feb"]
    b28: repeat x 28 [append [] form x] ; not assuming
    b30: ["29" "30"] ; optimization based on above reversed joins
    b31: ["31"]
    parse date-str [
        copy d some digit "-" copy m some alpha
        check (
            any [
                all [find b31 d find m31 m]
                all [find b30 d find m30 m]
                all [find b28 d find m28 m]
            ]
        )
    ]
Which would be faster would depend on the data and scenario. | |
(the comments on the second example can be ignored) | |
Your proposal seems like a slightly faster but more limited version of alternates, and not as flexible or optimizable as check. Does this situation come up so often that you need direct support for it? | |
Here's a simpler date checker with CHECK: parse date-str [copy d [1 2 digit "-" 3 alpha "-" 4 digit] check (attempt [to-date d])] | |
That requires years too, but at least it gets leap year 29-Feb. | |
Gabriele 17-Nov-2008 [3218] | Brian, JOIN does a REDUCE on the second block. |
BrianH 17-Nov-2008 [3219] | Right you are, whoops. It's been a while since I used it with blocks. |
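A minimal sketch of the point Gabriele raises, assuming REBOL 2 semantics: JOIN copies its first argument and reduces the second, so a block that contains the bare word | is evaluated and fails because | has no value. Appending to a copy leaves the second block unevaluated. The month names come from the example above; the final PARSE call is only an illustrative check.

```rebol
m31: ["Jan" | "Mar" | "May" | "Jul" | "Aug" | "Oct" | "Dec"]

; JOIN reduces its second argument, so the bare word | would be
; evaluated here and raise an error (| has no value):
; m30: join m31 [| "Apr" | "Jun" | "Sep" | "Nov"]

; APPEND on a copy inserts the contents of the second block as-is
; and leaves m31 itself untouched.
m30: append copy m31 [| "Apr" | "Jun" | "Sep" | "Nov"]
m28: append copy m30 [| "Feb"]

; Illustrative check: "Feb" should only be accepted by m28.
probe parse "Feb" m28
probe parse "Feb" m30
```

The copy matters: APPEND modifies its first argument in place, so without it m31 would end up holding all twelve months.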
Chris 18-Nov-2008 [3220x3] | 'append would do it...
numbers don't work in string parsing - I thought about this when I developed the example, thought it might be possible as the numbers appear outside the dialect. But 'check seems like the better option.
joins were in the wrong direction - d'oh!
simpler date checker - that's only useful if to-date recognizes the date format : ) (and using dates was illustrative - there are other situations with similar needs).
Though on dates, what would be the most succinct way with the proposals on the table to do the following?
    ameridate: "2/15/2008"
    parse ameridate ...rule...
    newdate = 15-Feb-2008
One attempt:
    parse ameridate [
        use [d m][
            change [copy m 1 2 digit "/" copy d 1 2 digit] (rejoin [d "/" m])
        ]
        "/" 4 digit end check (newdate: to-date ameridate)
    ] |
(making the assumption it is a valid date) | |
(and that it's ok to modify the original string) | |
eFishAnt 22-Nov-2008 [3223x4] | If I am parsing something like javascript that has { and } in it like C, how can I put that into a string to parse without using {} |
my test cases for testing a dialect, I usually use this form: | |
print parse/all {test case that can't have {} inside } parse-rule | |
There must be some dead-simple guru trick for this... | |
Sunanda 22-Nov-2008 [3227] | I think I raised the same question on the ML years ago, and got a disappointing answer. Maybe things have changed since. Or, if not, it may not be too late to add to the R3 parse wishlist: http://www.rebol.org/ml-display-thread.r?m=rmlSQHQ |
eFishAnt 22-Nov-2008 [3228x3] | My first inclination is to "go binary" on it...;-) but that is inelegant, and the MSB of binary gets wonky sometimes. |
(MSB bit is a sign bit oftentimes) | |
I was thinking of taking away the special meaning of { but not yet sure how to unset it...it initially seems hardcoded in there....not like I would expect. | |
Oldes 22-Nov-2008 [3231] | I don't understand what you mean. |
eFishAnt 22-Nov-2008 [3232x2] | In the console if you type a { and then hit Enter, it continues on the next line. :{ and }: don't seem to work, either. |
The problem is if you have a string of Javascript which uses {} inside, then it is hard to get REBOL to make a string of Javascript that has {} inside. | |
Sunanda 22-Nov-2008 [3234] | Replace "{" with to-char 0 then put it back afterwards? (Assuming to-char 0 does not occur in your string)? I've done that sort of thing before to get around parse limitations (whether the limitation is in 'parse or my understanding of it) |
Oldes 22-Nov-2008 [3235] | what about escaping? |
eFishAnt 22-Nov-2008 [3236] | so ^{ is what you mean by escaping? |
Oldes 22-Nov-2008 [3237] | yes |
eFishAnt 22-Nov-2008 [3238] | that's the old XON/XOFF sw handshaking trick. So you mean to add ^ in the Javascript, and then wrapping it with {} won't cause a script error? |
Oldes 22-Nov-2008 [3239x2] | You want to write JS in Rebol console? |
>> s: {a^/{b}c^{d^}"e"}
== {a
{b}c{d}"e"} | |
eFishAnt 22-Nov-2008 [3241x4] | no, I am parsing it automatically...the console I was just testing, but that makes the problem worse, methinks |
I think you meant this (or at least I would have wanted you to mean this):
>> s: {a^{b^}c^{d^}"e"}
== {a{b}c{d}"e"} | |
that still doesn't cut the mustard for me...but can be useful for part of it. to-string 4838f{gmgmg{ ;this doesn't work because REBOL doesn't know what datatype 4838f{gmgmg{ is, but I don't want REBOL to know that. I want REBOL to make it into a string. | |
to-string/raw or to-string/force could be a useful refinement so that I could write parse like this: print parse/all to-string/force ;alksjdf;alsjdflk;{""""}}}}} parse-rules ;I think this would be a VERY useful thing. | |
Oldes 22-Nov-2008 [3245] | I don't understand how it could work like |
eFishAnt 22-Nov-2008 [3246] | It would need a terminator pattern to know when to stop:
    print parse/all to-string/force ;alksjdf;alsjdflk;{""""}}}}}terminate parse-rules ;I think this would be a VERY useful thing. |
Oldes 22-Nov-2008 [3247x3] | you mean something like that http://en.wikipedia.org/wiki/HEREDOC ? That could be useful. |
but that's not parse related, the Rebol lexer should be improved to do that. | |
we can try to ask Carl | |
eFishAnt 22-Nov-2008 [3250x4] | It is strongly related...for sure. One of my requirements in a good OS which is not done yet is to NEVER lose anything I type, EVER. |
That page of Wiki is very good to beg my case, thanks! | |
I remember another cool console that someone did in a web page that was very slick...sorry, my mind associates things that it shouldn't...my parser is too massively parallel...;-) | |
It's like closing the loop on Chris's make-doc-anywhere...but REBOL can be the most elegant here for that Heredoc. | |
BrianH 22-Nov-2008 [3254x2] | eFishAnt, the {} nesting rules only apply to strings entered in {}, not "". They are also only a REBOL syntax trick. Strings that you construct or read from somewhere don't have any special escaping rules or syntax. |
REBOL doesn't lose what you type, but if what you are typing is treated as REBOL syntax, then it is interpreted as such, not lost. | |
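A short sketch of the distinction BrianH describes, assuming REBOL 2. Inside a {} literal only unbalanced braces need the ^ escape, a "" literal does not nest braces at all, and text loaded at run time never passes through the REBOL lexer, so its braces need no escaping. The file name %script.js and the trivial rule are hypothetical, used only for illustration. (In R3, READ returns binary, so a to-string conversion would presumably be needed first.)

```rebol
; Balanced braces nest inside a {} literal without escaping.
s1: {function f() { return 1; }}

; An unbalanced brace has to be escaped with ^.
s2: {an opening ^{ with no partner}

; Strings written with "" have no brace nesting rules.
s3: "a { inside double quotes"

; Text read from a file is plain data, not REBOL source.
; %script.js is a hypothetical file name.
js: read %script.js
print parse/all js [to "{" to end]
```

For test cases, keeping the JavaScript in a separate file and reading it in sidesteps the escaping problem entirely.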
Robert 23-Nov-2008 [3256x2] | I have a problem with sub-rule parsing. I have:
    myrules: [
        any [
            set src path! set des path! (?? src ?? des)
            | 'keyword into [ ... ]
        ]
    ]
If the input is
    a/1 a/2 mykeyword a/1 a/2
only the first paths are printed. Why is this? I thought the ANY rule keeps going until the end. |
I think it has something to do with INTO not returning true. | |
Oldes 23-Nov-2008 [3258] | The INTO is still just a proposal http://www.rebol.net/wiki/Parse_Project#INTO |
Anton 23-Nov-2008 [3259] | Robert, your rule steps INTO the second a/1, and then, I suppose, fails to reach the end of it. Show us the INTO rule, and we will tell you what is wrong with it. PARSE will treat the path a/1 like the block [a 1], eg:
>> parse [a/1][into [here: (print mold here) 'a 1 1 1]]
[a 1]
== true
(You also wrote 'keyword in your rules, but 'mykeyword in your input. I'm hoping that was just a typo.) |
BrianH 23-Nov-2008 [3260] | INTO is not a proposal, the proposal is to extend INTO to string types, possibly with a type check. |
eFishAnt 23-Nov-2008 [3261] | Thanks for all the help on parse...I have my new platform working 100% as of this morning. The labor of over a year of technology...and the code keeps getting smaller and faster. 100% REBOL. It's all dialects. |