World: r3wp
[REBOL Syntax] Discussions about REBOL syntax
BrianH 17-Feb-2012 [226] | I've been meaning to adapt those rules to R2, though. There should be more bugs there, and unlike the bugs in the R3 syntax, we can't fix them in R2. |
Steeve 17-Feb-2012 [227x4] | I don't know if it's what you mean, but there is only a need to check whether a word! is optionally followed by a tag! |
i.e. for R3
value-syntax: [
    block-syntax
    | paren-syntax
    | integer-syntax
    | decimal-syntax
    | char-syntax
    | quoted-string
    | braced-string
    | binary-syntax
    | tuple-syntax
    | word-syntax opt tag-syntax ;<=== there
] | |
No, forget it, it needs more work | |
Agreed with you | |
BrianH 17-Feb-2012 [231] | You're right, the word followed by tag works. I also just fixed another bug: the ">" needed to be a choice after ">>". |
Steeve 17-Feb-2012 [232x2] | for R3, I think the following trick is enough:
word-syntax: [ ... [and tag-syntax | termination] ] |
R2 needs more | |
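A minimal sketch of the lookahead trick mentioned above, written for R3 because it relies on PARSE's `and` keyword. The names `word-char`, `tag-syntax` and `termination` below are toy stand-ins introduced only for illustration; the real rules from the syntax work under discussion are more involved and are not shown in this thread.

```rebol
; Assumptions: toy stand-ins, not the real rules being discussed
word-char: charset [#"a" - #"z"]
tag-syntax: [#"<" some word-char #">"]
termination: [end]

; A word ends either just before a tag (checked with AND, which does
; not consume input) or at a termination point
word-syntax: [some word-char [and tag-syntax | termination]]

print parse "a<a>" [word-syntax tag-syntax]
print parse "a<" [word-syntax]
```

Read off the rule, the first input is a word followed by a complete tag and should match, while the second ends in a bare "<" that is neither a tag nor a termination point and should not.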
BrianH 17-Feb-2012 [234] | If I was doing a full parser I'd try to cut down on the lookahead parsing of more than a single character or charset, so as to avoid repeating the parsing. Plus, for R3 there's a ticket to improve tag syntax that I'd like implemented (single-quote strings in tags). For an R2 parser I suppose that an approach that is more tolerant of design flaws would be appropriate, since its syntax is in bug-for-bug backwards-compatibility mode. |
Steeve 17-Feb-2012 [235x3] | I still can't believe that the syntax has so many drawbacks |
I mean, they can't be deliberate | |
I'm curious to see how it was coded | |
BrianH 17-Feb-2012 [238x2] | There really weren't that many, and most of them were fixed in alpha 97. Some of them are deliberate tradeoffs, such as http://issue.cc/r3/1317 |
One interesting thing is that R2 and R3 have binary parsers, not text parsers. The difference doesn't matter that much in R2 but in R3 it matters a lot. | |
Steeve 17-Feb-2012 [240] | try this in R2 [a<] and [a< ] |
BrianH 17-Feb-2012 [241x2] | All of the syntax characters in R3 fit in the ASCII range. That is why there are no Unicode delimiters, such as the other space characters. |
Yup, that is why he made the tradeoff. | |
Steeve 17-Feb-2012 [243] | Annoying syntax... I'll take a break for now |
Ladislav 18-Feb-2012 [244x7] | the "+<tag>" case differs from the "-<tag>" case! |
#[[BrianH
>> load "a<a>"
== [a <a>]
Looks good to me.
]]BrianH
Well, it does not look problematic at first sight, but it does look problematic once we compare it to:
>> load "1.2.3<t>"
** Syntax error: invalid "tuple" -- "1.2.3<t>"
** Where: to case load
** Near: (line 1) 1.2.3<t> | |
My opinion is that there needs to be a "common syntax rule" (either allowing #"<" as a syntax separator character or not) | |
Similarly the above load "+<tag>" and load "-<tag>" look like an inconsistency in syntax. | |
When compared to load ".<tag>" | |
I wrote http://issue.cc/r3/1903 http://issue.cc/r3/1904 http://issue.cc/r3/1905 | |
Regarding the three tickets above: what are the preferences of potential users?
a) reflect all these "as is" in the syntax.r code
b) do something else? | |
Steeve 18-Feb-2012 [251x2] | We could produce several documents (btw, I don't think it's practical to keep mixing R2 and R3 syntax):
- R3 pure expected syntax (without glitches or inconsistencies)
- R2 pure expected syntax (without glitches or inconsistencies)
- R3 with glitches
- R2 with glitches |
Wow, I almost forgot: in R3, [#] is a shortcut for [none] | |
Ladislav 18-Feb-2012 [253] | I guess that nobody uses that. |
Steeve 18-Feb-2012 [254] |
issue-char-R2: complement union charset "@" termination-char
issue-char-R3: complement union charset "@$%:<>\" termination-char |
Ladislav 18-Feb-2012 [255] | OK, I will put it in |
Steeve 18-Feb-2012 [256x7] | correction: issue-char-R3: complement union charset "@$%:<>\#" termination-char |
I use a function to automate the testing of the charsets | |
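test-syn: func [
    chars [bitset!] sample [string!] /local c l t? ci
][
    ; type that the sample itself loads as (with the literal "?")
    t?: type? first to-block sample
    ; try each of the 256 byte values in place of "?"
    repeat i 256 [
        c: replace copy sample "?" ci: to-string to-char i - 1
        ; only test characters that the charset accepts
        if find ci chars [
            if error? l: try [to-block c] [
                l: disarm l
                l: reform [l/id l/arg1 l/arg2]
            ]
            ; report anything that is not exactly one value of the expected type
            if any [1 <> length? l t? <> type? l/1][
                print [i - 1 mold to-char i - 1 mold l attempt [type? l/1]]
            ]
        ]
    ]
] | |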
example for issue!
>> test-syn issue-char "#?" | |
it prints out all the errors | |
works with R2 or R3 | |
but you need to have [disarm] when used with R3. I use this definition:
>> unless value? 'disarm [disarm: func[e][:e]] | |
Steeve 19-Feb-2012 [263x3] |
issue-syntax-R3: [#"#" some issue-char-R3 termination]
issue-syntax-R2: [#"#" any issue-char-R2 termination] |
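A minimal sketch of how the issue rules above could be exercised with PARSE. `termination-char` and `termination` are never defined in this thread, so the versions below are simplified stand-ins (a small delimiter set, and end of input as the only termination) and should be read as assumptions. The sample inputs contain no whitespace, so PARSE's whitespace handling does not come into play.

```rebol
; Assumptions: simplified stand-ins for definitions not shown in this thread
termination-char: charset { ^-^/^M[](){}";}
termination: [end]

; Charsets as posted above (R3 version includes the later "#" correction)
issue-char-R2: complement union charset "@" termination-char
issue-char-R3: complement union charset "@$%:<>\#" termination-char

issue-syntax-R3: [#"#" some issue-char-R3 termination]
issue-syntax-R2: [#"#" any issue-char-R2 termination]

; Compare both rules on a few samples
foreach sample ["#abc" "#a<b" "#"] [
    print [
        mold sample
        "R2 rule:" parse sample issue-syntax-R2
        "R3 rule:" parse sample issue-syntax-R3
    ]
]
```

Read straight off the rules, "#a<b" is accepted only by the R2 rule because "<" is excluded from the R3 charset, and a bare "#" is accepted only by the R2 rule because it uses `any` where the R3 rule uses `some`.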
tag-char-beg: complement union whitespace charset {=<>"^@}
tag-char: complement charset {">^@}
tag-syntax-R3: [
    #"<" [not #"]" tag-char-beg | quoted-string]
    any [some tag-char | quoted-string]
    #">" termination
]
tag-syntax-R2: [
    #"<" [tag-char-beg | quoted-string]
    any [some tag-char | quoted-string]
    #">" termination
] | |
in R3 the exception for a starting #"]" may be a bug | |
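A minimal sketch comparing the two tag rules above, written for R3 because the R3 rule uses PARSE's `not` keyword. `whitespace`, `quoted-string` and `termination` are not defined in this thread; the versions here are simplified stand-ins (for example, `quoted-string` ignores caret escapes) and are assumptions. `non-quote` is a hypothetical helper name.

```rebol
; Assumptions: simplified stand-ins for definitions not shown in this thread
whitespace: charset " ^-^/^M"
non-quote: complement charset {"}
quoted-string: [#"^"" any non-quote #"^""]   ; ignores ^ escapes
termination: [end]

; Rules as posted above
tag-char-beg: complement union whitespace charset {=<>"^@}
tag-char: complement charset {">^@}
tag-syntax-R3: [
    #"<" [not #"]" tag-char-beg | quoted-string]
    any [some tag-char | quoted-string]
    #">" termination
]
tag-syntax-R2: [
    #"<" [tag-char-beg | quoted-string]
    any [some tag-char | quoted-string]
    #">" termination
]

; Compare both rules on a few samples
foreach sample ["<a>" "<]>" {<a="c>d">}] [
    print [
        mold sample
        "R2 rule:" parse sample tag-syntax-R2
        "R3 rule:" parse sample tag-syntax-R3
    ]
]
```

The only sample the two rules should disagree on is "<]>", which is exactly the starting #"]" exception mentioned above. The last sample shows why `quoted-string` is an alternative inside the tag body: a ">" inside quotes does not close the tag.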
BrianH 19-Feb-2012 [266] | Someone's complained about it, but I think it's an intentional fix to this bug in R2:
>> [ < ]
== [<]
>> [<]
** Syntax Error: Invalid tag -- <
** Near: (line 1) [<] |
Steeve 19-Feb-2012 [267x2] | To me, it's more related to a mistake in the tag! decoding |
but anyway | |
BrianH 19-Feb-2012 [269x2] | When people wanted to refer to the < word in R2, and they can't use the lit-word syntax for arrow words in R3 and pre-a97 R3, one way is to store that word in a block and use FIRST to get the value. However, in R2 that resulted in a value that LOAD choked on. The <] tradeoff was made really early on in the R3 project to solve that issue. The alternative would be to make MOLD mold [<] as [< ], or more specifically to make < mold as "< ", with an extra space every time. |
in R3 and pre-a97 R3 -> in R2 and pre-a97 R3 | |
Steeve 19-Feb-2012 [271] | I would add that it's easily bypassed in R2 if one inserts a blank after <
>> [< ]
== [<] |
BrianH 19-Feb-2012 [272] | The way MOLD is written, the values are molded by code that doesn't know it's in a block. You could have the ] handling code check against a charset of iffy characters and then optionally insert an extra space if found, but that doesn't deal with user-written code where [>] works and [<] doesn't. The usage of ] as the first character in a tag is so rare that it's not a bad tradeoff to make. |
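A minimal sketch of the MOLD/LOAD round trip being discussed, using only standard functions (MOLD, LOAD, ATTEMPT, PROBE). The variable names are hypothetical. No output is shown because, per the messages above, the result depends on the interpreter: R2 is reported to reject the molded text, while R3 accepts it thanks to the <] tradeoff.

```rebol
; A block holding the word <, written with a trailing blank so that
; the source itself loads in both R2 and R3
blk: [< ]

; MOLD does not add the blank back
text: mold blk
probe text

; ATTEMPT returns none if LOAD raises a syntax error
result: attempt [load text]
either none? result [
    print "LOAD rejected the molded text"
][
    probe result
]
```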
Steeve 19-Feb-2012 [273x3] | Well, I agree |
Introducing email! datatype next.
form: '?[*-:-*']
- ':' may be in the first position only
- '<' can't be in the first position
- '%FF' escapes chars in hex notation | |
Both R2, R3
escape-uri: [#"%" 2 hex-digit]
email-char: complement union charset {%@:} termination-char
email-syntax: [
    [#":" | not #"<" email-char | escape-uri]
    any [email-char | escape-uri]
    #"@"
    any [email-char | escape-uri]
    termination
] | |
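A minimal sketch that runs the email rule above against a few samples. It is written for R3 because the rule uses PARSE's `not` keyword. `hex-digit`, `termination-char` and `termination` are not defined in this thread, so the versions below are simplified stand-ins and should be read as assumptions.

```rebol
; Assumptions: simplified stand-ins for definitions not shown in this thread
hex-digit: charset "0123456789ABCDEFabcdef"
termination-char: charset { ^-^/^M[](){}";}
termination: [end]

; Rules as posted above
escape-uri: [#"%" 2 hex-digit]
email-char: complement union charset {%@:} termination-char
email-syntax: [
    [#":" | not #"<" email-char | escape-uri]
    any [email-char | escape-uri]
    #"@"
    any [email-char | escape-uri]
    termination
]

; One sample per constraint listed above
foreach sample ["a@b" ":a@b" "a:b@c" "<a@b" "a%40b@c"] [
    print [mold sample parse sample email-syntax]
]
```

Following the rule by hand, "a:b@c" and "<a@b" should be the two rejected samples: ":" is allowed only in the first position, and "<" is not allowed there. "a%40b@c" exercises the `escape-uri` alternative.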