World: r3wp
[REBOL Syntax] Discussions about REBOL syntax
BrianH 19-Feb-2012 [269x2] | When people wanted to refer to the < word in R2, and they can't use the lit-word syntax for arrow words in R3 and pre-a97 R3, one way is to store that word in a block and use FIRST to get the value. However, in R2 that resulted in a value that LOAD choked on. The <] tradeoff was made really early on in the R3 project to solve that issue. The alternative would be to make MOLD mold [<] as [< ], or more specifically to make < mold as "< ", with an extra space every time. |
in R3 and pre-a97 R3 -> in R2 and pre-a97 R3 | |
Steeve 19-Feb-2012 [271] | I would add that it's easily bypassed in R2 if one inserts a blank after <: >> [< ] == [<] |
BrianH 19-Feb-2012 [272] | The way MOLD is written, the values are molded by code that doesn't know it's in a block. You could have the ] handling code check against a charset of iffy characters and then optionally insert an extra space if found, but that doesn't deal with user-written code where [>] works and [<] doesn't. The usage of ] as the first character in a tag is so rare that it's not a bad tradeoff to make. |
Steeve 19-Feb-2012 [273x3] | Well, I agree |
Introducing the email! datatype next. Form: '?[*-:-*']. ':' may be in the first position only; '<' can't be in the first position; '%FF' escapes chars in hex notation. | |
Both R2, R3:
escape-uri: [#"%" 2 hex-digit]
email-char: complement union charset {%@:} termination-char
email-syntax: [
    [#":" | not #"<" email-char | escape-uri]
    any [email-char | escape-uri]
    #"@"
    any [email-char | escape-uri]
    termination
] | |
Andreas 19-Feb-2012 [276x2] | Hmm, when : is in the first position, a : can occur anywhere afterwards as well. |
For example, [:a:@:b:] | |
Steeve 19-Feb-2012 [278] | Then it's no longer an email! but a url! |
Andreas 19-Feb-2012 [279] | Not in R3. |
Steeve 19-Feb-2012 [280x5] | right |
right | |
right | |
good catch, true in R2 also | |
Arg, It will be hard to keep the rule tight | |
BrianH 19-Feb-2012 [285] | I figure that we should look at the email formatting standard, then subtract support for any syntax that would conflict with something else in REBOL, especially if that doesn't commonly show up in actual email addresses. We've already made some tradeoffs in favor of email (i.e. no @ in issues or words), maybe we want to make more. |
Andreas 19-Feb-2012 [286] | Where would we "want" to do that? |
BrianH 19-Feb-2012 [287] | Doesn't work for R2 though - that syntax just needs to be documented, it can't be changed. |
Andreas 19-Feb-2012 [288x2] | Or how would such a desire reflect? |
In filing CC issues? | |
BrianH 19-Feb-2012 [290x2] | When I was trying to replicate the R3 word syntax, it was partly to document R3, partly to serve as the basis of a more flexible TRANSCODE that would allow people to handle more sloppy syntax without removing the valuable errors from the regular TRANSCODE, but mostly it served to generate new CC tickets for syntax bugs that we weren't aware of because the syntax wasn't well enough documented, and they hadn't come up in practice yet. |
There is a large, unknown number of such bugs in URL syntax, for instance. I wouldn't be surprised if that is the case with email too. | |
Andreas 19-Feb-2012 [292x2] | If it's obvious bugs, that's comparatively easy, yes. |
Your initial message above sounded more like wishes towards a more restricted email!. | |
BrianH 19-Feb-2012 [294x2] | A more thorough examination of the syntax makes more of these bugs obvious. |
I don't necessarily want a more restricted email! than it is already, but if we are expanding what is possible with email!, it will still likely need to be restricted relative to the email standard. | |
Andreas 19-Feb-2012 [296] | We are not expanding anything :) We are just describing what syntactical rules the REBOL email! literal syntax follows. |
BrianH 19-Feb-2012 [297] | I'm a little more concerned with R3 URL syntax though, since in that case there are real bugs that have already affected people in real cases, and because hypothetically a lot of the bugs are fixable in mezzanine code. |
Andreas 19-Feb-2012 [298] | And as the email! datatype can be used for many a purpose within dialects, it does not necessarily have to match RFC822 (or rather 5322) exactly. |
Steeve 19-Feb-2012 [299] | But the syntax checking can't be corrected with mezzanines, right? |
Andreas 19-Feb-2012 [300] | (Which would be a relatively complex problem anyway ... http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html) |
BrianH 19-Feb-2012 [301x2] | Steeve: For emails, no. For urls, yes. |
For url! the syntax checking is mostly done by the DECODE-URL mezzanine. We can't change what is recognized as a url! by REBOL, but we can change how the data is treated once it's recognized. There are errors in escape handling, for instance. | |
Steeve 19-Feb-2012 [303] | Corrected version, works with R2 and R3:
escape-uri: [#"%" 2 hex-digit]
email-char: complement union charset {%@:} termination-char
email-esc: [email-char | escape-uri]
email-syntax: [
    [
        #":" any [email-esc | #":"] #"@" any [email-esc | #":"]
        | not #"<" some email-esc #"@" any email-esc
    ]
    termination
] |
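The corrected email! rule can be tried against Andreas's `:a:@:b:` case with a small harness. This is only a sketch, assuming an R3 interpreter (for the NOT keyword in PARSE). The thread never defines `hex-digit`, `termination-char` or `termination`, so the three definitions below are placeholders chosen for illustration, not the ones Steeve used; in particular `termination` is reduced to end-of-input because each candidate is parsed as a whole string.

```rebol
REBOL [Title: "email! syntax sketch (assumptions marked)"]

; ASSUMED helpers: none of these three is defined in the discussion.
hex-digit: charset [#"0" - #"9" #"a" - #"f" #"A" - #"F"]
termination-char: charset " ^-^/[](){}^";/"   ; guessed delimiter set
termination: [end]                            ; simplification: whole-string test only

; Rules as posted in the corrected version
escape-uri: [#"%" 2 hex-digit]
email-char: complement union charset {%@:} termination-char
email-esc: [email-char | escape-uri]
email-syntax: [
    [
        #":" any [email-esc | #":"] #"@" any [email-esc | #":"]
        | not #"<" some email-esc #"@" any email-esc
    ]
    termination
]

; Print each candidate next to the PARSE result
foreach candidate [":a:@:b:" "user@example.com" "<a@b" "a%41@b"] [
    print [mold candidate "->" parse candidate email-syntax]
]
```

The four candidates exercise, in order: colons on both sides of the `@` after a leading colon, an ordinary address, a leading `<`, and a `%`-escape in the local part.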
Andreas 19-Feb-2012 [304] | Ah, was wondering. So we can't change the syntax of url!s in R3 either, we can only improve/bugfix url! handling. |
BrianH 19-Feb-2012 [305] | You'd be surprised at how flexible the syntax of url! is in R3 :) |
Andreas 19-Feb-2012 [306] | I don't think I would. |
BrianH 19-Feb-2012 [307x2] | Fair enough. But if you can figure out exactly hor MOLD handles escaping of urls, that would help narrow down what bugs we can fix in DECODE-URL. |
hor -> how | |
Andreas 19-Feb-2012 [309] | I would be slightly surprised if it is more flexible than string syntax, but I somehow doubt that :) |
BrianH 19-Feb-2012 [310] | Fewer escaping methods, so no. What's weird is that some kinds of string escaping work for the file! type. |
Steeve 20-Feb-2012 [311] | It's calm here |
Ladislav 20-Feb-2012 [312x2] | committed a couple of 1903-5 additions. You were right that #1905 is ugly, Steeve. |
Caught up with the code posted above. | |
Steeve 23-Feb-2012 [314x5] | url! syntax (both R2, R3). I've not created specific charsets, so the rule is more verbose.
- The first char! is the same as for word! (minus "+-")
- Must contain at least one ':'
- "/" is allowed only after the first ":"
- Escape-uri is allowed, like in email!
url-syntax: [
    not digit not #"'" not sign word-char
    any [escape-uri | not termination-char not #":" skip]
    #":"
    any [escape-uri | #"/" | not termination-char skip]
] |
Forgot the case when it begins with ".". I should have stuck much more closely to the word-syntax | |
url-syntax: [
    [#"." not digit | not digit not #"'" not sign word-char]
    any [escape-uri | not termination-char not #":" skip]
    #":"
    any [escape-uri | #"/" | not termination-char skip]
] | |
hum... still wrong | |
url-syntax: [
    not [digit | #"'" | #"." digit | sign] word-char
    any [escape-uri | not termination-char not #":" skip]
    #":"
    any [escape-uri | #"/" | not termination-char skip]
] | |
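The last url! rule can be exercised the same way. Again this is a rough sketch for an R3 interpreter: `digit`, `sign`, `word-char`, `hex-digit` and `termination-char` are not defined anywhere in the thread, so the charsets below are guesses made only so the rule can run. Note that `/` is placed in the assumed `termination-char`; otherwise the first `any` loop would accept a slash before the first colon, contrary to the stated intent. A trailing `end` is added so that each candidate must match in full.

```rebol
REBOL [Title: "url! syntax sketch (assumptions marked)"]

; ASSUMED helpers: the discussion does not define any of these.
digit: charset [#"0" - #"9"]
sign: charset "+-"
hex-digit: charset [#"0" - #"9" #"a" - #"f" #"A" - #"F"]
word-char: union charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"] charset "!&*=?_|~.+-"
termination-char: charset " ^-^/[](){}^";/"   ; guessed delimiter set, includes "/"

; Rules as posted
escape-uri: [#"%" 2 hex-digit]
url-syntax: [
    not [digit | #"'" | #"." digit | sign] word-char
    any [escape-uri | not termination-char not #":" skip]
    #":"
    any [escape-uri | #"/" | not termination-char skip]
]

; Print each candidate next to the PARSE result
foreach candidate ["http://example.com" "1http://x" "a/b:c" "mailto:a%20b"] [
    print [mold candidate "->" parse candidate [url-syntax end]]
]
```

The candidates cover an ordinary URL, a leading digit, a slash before the first colon, and a `%`-escape after the colon.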