World: r3wp

[REBOL Syntax] Discussions about REBOL syntax

Ladislav
17-Feb-2012
[209]
corrected
Steeve
17-Feb-2012
[210x4]
Another thing...
A zero-length word is valid with that rule:
 
 -> word-syntax succeeded
This way, word-syntax must be at least one character long:
word-syntax: [
	[
		slash-word
		| more-less-word
		| and word-char opt sign [#"." | not #"'"] not digit any word-char
	]
	termination 
]
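A minimal sketch of why the leading lookahead matters, assuming R3's parse dialect, where AND matches without consuming input. The charset is a simplified, hypothetical stand-in for the real word-char, and loose-word / strict-word are illustrative names, not rules from syntax.r:

```rebol
; Hypothetical, simplified charset; the real word-char is broader.
word-char: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]

loose-word:  [any word-char]                 ; ANY may match zero characters
strict-word: [and word-char any word-char]   ; lookahead demands at least one

print parse "" loose-word       ; the empty string is accepted
print parse "" strict-word      ; the empty string is rejected
print parse "abc" strict-word   ; an ordinary word still matches
```

The lookahead consumes nothing, so the rest of the rule sees the same input it saw before; it only rules out the empty match.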
Ladislav
17-Feb-2012
[214]
good catch
BrianH
17-Feb-2012
[215x2]
>> load "a<a>"
== [a <a>]
Looks good to me.
Steeve
17-Feb-2012
[217]
Thanks I forgot to whine about that one
BrianH
17-Feb-2012
[218x3]
There were some good reasons for the complexity of the rules in http://issue.cc/r3/1302
and the error handling was necessary too, since the actual word syntax 
is only recognized after it eliminates the error conditions.
For instance, some of the words are definitely handled as special-cases, 
rather than as normal word characters. Different follow sets in those 
cases too.
Every word with / or < or > in it is a special case.
Steeve
17-Feb-2012
[221x3]
It's not < or >; it must be a valid tag!, else it's an error
I don't think there is a need to break anything in word-syntax as 
it is
BrianH
17-Feb-2012
[224]
Actually, if it's not any one of these, specifically: "<" | ">" | 
"<>" | "<=" | ">=" | "<<" | ">>"
There aren't any other special cases in R3, I checked.
Steeve
17-Feb-2012
[225]
Check your version: in A111 it's not true anymore
BrianH
17-Feb-2012
[226]
I've been meaning to adapt those rules to R2 though. There should 
be more bugs there though, and unlike the bugs in the R3 syntax we 
can't fix them in R2.
Steeve
17-Feb-2012
[227x4]
I don't know if it's what you mean, but there is only a need to check 
whether a word! is optionally followed by a tag!
i.e. for R3

value-syntax: [
	block-syntax
	| paren-syntax
	| integer-syntax
	| decimal-syntax
	| char-syntax
	| quoted-string
	| braced-string
	| binary-syntax
	| tuple-syntax
	| word-syntax opt tag-syntax  ;<=== there
]
No, forget it, it needs more work
Agreed with you
BrianH
17-Feb-2012
[231]
You're right, the word followed by tag works. I also just fixed another 
bug: the ">" needed to be a choice after ">>".
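A small sketch of the ordering problem behind that fix, under the assumption that the alternatives are plain strings. Parse tries alternatives left to right and does not go back into a sub-rule once it has succeeded, so a shorter string listed first shadows the longer one. The names wrong-order and right-order are hypothetical:

```rebol
; Hypothetical rule names; only the order of the alternatives matters.
wrong-order: [">" | ">>" | ">="]
right-order: [">>" | ">=" | ">"]

print parse ">>" [wrong-order end]   ; ">" matches first, one ">" is left over
print parse ">>" [right-order end]   ; ">>" is tried first and consumes both
```

Listing the longer alternatives first avoids the problem without any extra lookahead.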
Steeve
17-Feb-2012
[232x2]
for R3, I think the following trick is enough

word-syntax: [
	...
	[and tag-syntax | termination] 
]
R2 needs more
BrianH
17-Feb-2012
[234]
If I was doing a full parser I'd try to cut down on the lookahead 
parsing of more than a single character or charset, so as to avoid 
repeating the parsing. Plus, for R3 there's a ticket to improve tag 
syntax that I'd like implemented (single-quote strings in tags). 
For an R2 parser I suppose that an approach that is more tolerant 
of design flaws would be appropriate, since its syntax is in bug-for-bug 
backwards-compatibility mode.
Steeve
17-Feb-2012
[235x3]
I still can't believe that the syntax has so many drawbacks
I mean, they can't be deliberate
I'm curious to see how it was coded
BrianH
17-Feb-2012
[238x2]
There really weren't that many, and most of them were fixed in alpha 
97. Some of them are deliberate tradeoffs, such as http://issue.cc/r3/1317
One interesting thing is that R2 and R3 have binary parsers, not 
text parsers. The difference doesn't matter that much in R2 but in 
R3 it matters a lot.
Steeve
17-Feb-2012
[240]
try this in R2
[a<] and [a<  ]
BrianH
17-Feb-2012
[241x2]
All of the syntax characters in R3 fit in the ASCII range. That is 
why there are no Unicode delimiters, such as the other space characters.
Yup, that is why he made the tradeoff.
Steeve
17-Feb-2012
[243]
Annoying syntax... I'll take a break for now
Ladislav
18-Feb-2012
[244x7]
the "+<tag>" case differs from the "-<tag>" case!
#[[BrianH
>> load "a<a>"
== [a <a>]
Looks good to me.
]]BrianH


Well, it does not look problematic at first sight, but it does 
look problematic once we compare it to:

>> load "1.2.3<t>"
** Syntax error: invalid "tuple" -- "1.2.3<t>"
** Where: to case load
** Near: (line 1) 1.2.3<t>
My opinion is that there needs to be a "common syntax rule", (either 
allowing #"<" as a syntax separator character or not)
Similarly the above

    load "+<tag>"

and

    load "-<tag>"

look like an inconsistency in syntax.
When compared to

    load ".<tag>"
I wrote

    http://issue.cc/r3/1903
    http://issue.cc/r3/1904
    http://issue.cc/r3/1905
Regarding the three above: what are the preferences of potential 
users?

a) reflect all these "as is" in the syntax.r code
b) do something else?
Steeve
18-Feb-2012
[251x2]
We could produce several documents

(Btw I don't think it's a practical idea to continue further mixing 
R2 and R3 syntax)
- R3 pure expected syntax (without glitches or inconsistencies)
- R2 pure expected syntax (without glitches or inconsistencies)
- R3 with glitches
- R2 with glitches
Wow, I almost forgot.
In R3 [#] is a shortcut for [none]
Ladislav
18-Feb-2012
[253]
I guess that nobody uses that.
Steeve
18-Feb-2012
[254]
issue-char-R2: complement union charset "@" termination-char

issue-char-R3: complement union charset "@$%:<>\" termination-char
Ladislav
18-Feb-2012
[255]
OK, I will put it in
Steeve
18-Feb-2012
[256x3]
correction:

issue-char-R3: complement union charset "@$%:<>\#" termination-char
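A quick way to see what that definition admits, as a sketch only. The termination-char below is a hypothetical stand-in, since the real charset is defined elsewhere in syntax.r and may differ:

```rebol
; Hypothetical stand-in for the real termination-char.
termination-char: charset [
	#" " #"^-" #"^/" #"[" #"]" #"(" #")" #"{" #"}" #"^"" #";"
]

; Evaluated as: complement (union (charset "...") termination-char)
issue-char-R3: complement union charset "@$%:<>\#" termination-char

print find issue-char-R3 #"a"   ; ordinary characters are members
print find issue-char-R3 #"@"   ; explicitly excluded characters are not
print find issue-char-R3 #"["   ; neither are termination characters
```

Building the excluded set first and complementing it once keeps the definition short: everything not listed is an issue character.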
I use a function to automate the testing of the charsets
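; R2 only (uses DISARM). The first "?" in sample is replaced by each
; character code 0-255 in turn; for characters that belong to chars,
; the result is loaded and any unexpected outcome is printed.
test-syn: func [
	chars [bitset!] sample [string!]
	/local c l t? ci
][
	; expected datatype, taken from the sample loaded as-is
	t?: type? first to-block sample
	repeat i 256 [
		; substitute character code i - 1 for "?" in a copy of the sample
		c: replace copy sample "?" ci: to-string to-char i - 1
		; only test characters that belong to the charset
		if find ci chars [
			if error? l: try [to-block c] [
				; turn a load error into a readable string
				l: disarm l
				l: reform [l/id l/arg1 l/arg2]
			]
			; report a failed load, or a result that is not exactly
			; one value of the expected type
			if any [1 <> length? l t? <> type? l/1][
				print [i - 1 mold to-char i - 1 mold l attempt [type? l/1]]
			]
		]
	]
]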