world-name: r3wp

Group: REBOL Syntax ... Discussions about REBOL syntax [web-public]
Andreas:
14-Feb-2012
We are aiming for a purely syntactic description. So semantic checks 
at a later stage are out of the picture for now (no action blocks 
in general, and here: no loading of digits in particular).
Ladislav:
14-Feb-2012
Yes, 1000.1000.1000 is recognized as a tuple. The fact that it is 
"too big" is another check.
Ladislav:
14-Feb-2012
aha, did not check the second one, needs a correction, then
Andreas:
14-Feb-2012
A very good question. No license attached, yet. We'll fix that.
Andreas:
15-Feb-2012
Have a go at adding some more, then :)
Steeve:
15-Feb-2012
I don't want a collision. I thought you or Ladislav would finish 
it, with the others making comments.
GrahamC:
15-Feb-2012
Can do a merge ...
Steeve:
16-Feb-2012
The slash words are weird. Didn't know that specific syntax.
It works in R2 but no longer in R3.

Maybe it was a mistake to allow such syntax in R2, which is why 
Carl deprecated it.
Steeve:
16-Feb-2012
Not working though (the first part is a valid refinement in R2 ):
>> [///x//]
== [///x //]
Maxim:
16-Feb-2012
I think it was mainly meant as a way to make /-based ops, things 
like // and ///. I don't see why it should be removed. It can 
only contain "/" characters; otherwise it's a refinement.
Steeve:
16-Feb-2012
there's no way [///x] should be a valid refinement. It will fail 
when evaluated
Ladislav:
16-Feb-2012
There is already a method for that
Steeve:
16-Feb-2012
Oh I thought you would say : a new I-pad or I-phone :-)
Ladislav:
16-Feb-2012
how about the + and - words? Is there a ticket for the refinement 
syntax?
Steeve:
16-Feb-2012
Oh, I see. you mean that "^@" is a problem wherever it is placed 
in a script
Steeve:
16-Feb-2012
Well, there are some annoying cases where values are stuck together 
but still valid.
>> [a< 1]
== [a < 1]
Steeve:
16-Feb-2012
I need a pause :-)
BrianH:
16-Feb-2012
Word syntax (for R3), in a comment here: http://issue.cc/r3/1302
BrianH:
16-Feb-2012
In a full syntax description, some of those charsets would be named 
differently of course.
Steeve:
17-Feb-2012
You raise a good question, Brian. Should the rules emit their own 
errors?

I guess the intent is to keep them light for now. If an error occurs, 
the parsing just stops.

Another guess is that alternative rules (eg. inside value-syntax) 
should be kept order independent when possible (with the drawback 
that they are slightly more convoluted than necessary).
Ladislav:
17-Feb-2012
I committed a slightly different variant (but based on your findings 
as well). Could you (or other volunteers) check it, please?
Steeve:
17-Feb-2012
Remaining problems with [termination] in word-syntax.
When a word is stuck with a less-word or a tag (R2 and R3)
a<
 == [a <] valid
a<=
 == [a <=] valid
a<>
 == [a <>] valid
a<<
 == [a <<] valid
a<tag>
 == [a <tag>] valid

a>=
 invalid
a>
 invalid
a>>
 invalid

IMO, it's enough to check only whether the following char is #"<"

word-syntax: [
	slash-word termination
	| more-less-word termination
	| opt sign [#"." | not #"'"] not digit any word-char
		[termination | and #"<"]
]
Ladislav:
17-Feb-2012
a<
 == [a <] - this does not work in R3
Steeve:
17-Feb-2012
Hmm... my version is a bit old; with the latest version you're right
Steeve:
17-Feb-2012
but, it remains a problem when followed by a tag
Ladislav:
17-Feb-2012
remains a problem when followed by a tag
 - indeed, needs adjustment then
Ladislav:
17-Feb-2012
A question for Brian: do you think the case

load "a<a>" ; == [a <a>]

should be mentioned in CC? (And if so, where: a new ticket or an 
old one?)
Ladislav:
17-Feb-2012
That is because #"/" is a separator in paths
Steeve:
17-Feb-2012
well it's a problem for the other datatypes which use [termination]
Steeve:
17-Feb-2012
example: word-syntax
with the current rule a word can be terminated with #"/"
Ladislav:
17-Feb-2012
yes, but that is not a problem
Ladislav:
17-Feb-2012
Yes, but that is not a problem for the word, it is an invalid path
Steeve:
17-Feb-2012
[aaaa/] is not a word! it's a path!
Maxim:
17-Feb-2012
the AND is a look ahead.  it doesn't advance the input, so whatever 
is matched by    [ AND termination ] only tries to find a delimiter.
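The look-ahead behaviour described for AND has a close analogue in regular-expression lookahead. The following is only a minimal Python sketch of that idea, not the syntax.r rule: the name `word_before_less` and the "lowercase letters only" word are assumptions made for illustration, standing in for the real word-char charset.

```python
import re

# Hypothetical, simplified word rule: lowercase letters only, then a
# lookahead that requires "<" to follow without consuming it.
word_before_less = re.compile(r"[a-z]+(?=<)")

text = "a<tag>"
m = word_before_less.match(text)

word = m.group()        # the matched word
rest = text[m.end():]   # input left for the next rule, "<" included

print(word, rest)

# Without a "<" ahead, the rule fails instead of consuming anything.
no_match = word_before_less.match("a>")
print(no_match)
```

Because the lookahead leaves the "<" in the input, a following tag rule would still see the whole tag.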
Steeve:
17-Feb-2012
At least it's a problem with [termination] used in tuple-syntax and 
decimal-syntax.
Don't say i'm wrong here again :-)
Ladislav:
17-Feb-2012
that is not a valid path, but this one is:

>> type? second load "a/1.2.3/b"
== tuple!
Steeve:
17-Feb-2012
it may be valid but it has nothing to do with a path anymore
Gregg:
17-Feb-2012
Paths have to start with a word, don't they?
Gregg:
17-Feb-2012
Otherwise, they load as a block.
Maxim:
17-Feb-2012
yep. so the above is not a path :-) it's a tuple, followed by 
a refinement.
Steeve:
17-Feb-2012
Argh sorry
>> [a/1.2.3/a]
== [a/1.2.3/a]
valid path
Steeve:
17-Feb-2012
>> [a/0.3/a]
** Syntax error: invalid "decimal" -- "0.3"
** Near: (line 1) [a/0.3/a]
Ladislav:
17-Feb-2012
Hmm, this must be a LOAD bug, I think
Ladislav:
17-Feb-2012
>> type? second load "a/1.1"
== decimal!
Steeve:
17-Feb-2012
this doesn't work in R3 either
>> [0.3/a]
** Syntax error: invalid "decimal" -- "0.3"
** Near: (line 1) [0.3/a]
Ladislav:
17-Feb-2012
That really deserves a CC ticket
Andreas:
17-Feb-2012
Looks like a bug indeed.
Ladislav:
17-Feb-2012
>> a: [0.3 xxx]
== [0.3 xxx]

>> a/0.3
== xxx
Steeve:
17-Feb-2012
So you got me with the "it's a bug !"
Okkkkkkkkk :)
Steeve:
17-Feb-2012
At least it should be noted as a comment in the source
Steeve:
17-Feb-2012
I still think there is a problem with that form [aaa/]

It will be checked like 2 separate valid words although it's an invalid 
path
Maxim:
17-Feb-2012
btw, I was not able to use decimals in paths in R2.
>> a: [0.3 test]
== [0.3 test]
>> a/0.3
== none
Maxim:
17-Feb-2012
I'm thinking it's a rounding/precision error.
Ladislav:
17-Feb-2012
A change committed. "aaa/" is not accepted now, however, there are 
differences we should discuss.
Steeve:
17-Feb-2012
a zero length word is valid with that rule
BrianH:
17-Feb-2012
>> load "a<a>"
== [a <a>]
Looks good to me.
BrianH:
17-Feb-2012
Every word with / or < or > in it is a special case.
Steeve:
17-Feb-2012
it's not < or > , it must be a valid tag!, else it's an error
Steeve:
17-Feb-2012
I don't think there is a need to break anything in word-syntax as 
it is
Steeve:
17-Feb-2012
I don't know if it's what you mean, but there is only a need to check 
whether a word! is optionally followed by a tag!
BrianH:
17-Feb-2012
You're right, the word followed by tag works. I also just fixed another 
bug: the ">" needed to be a choice after ">>".
BrianH:
17-Feb-2012
If I was doing a full parser I'd try to cut down on the lookahead 
parsing of more than a single character or charset, so as to avoid 
repeating the parsing. Plus, for R3 there's a ticket to improve tag 
syntax that I'd like implemented (single-quote strings in tags). 
For an R2 parser I suppose that an approach that is more tolerant 
of design flaws would be appropriate, since its syntax is in bug-for-bug 
backwards-compatibility mode.
BrianH:
17-Feb-2012
One interesting thing is that R2 and R3 have binary parsers, not 
text parsers. The difference doesn't matter that much in R2 but in 
R3 it matters a lot.
Steeve:
17-Feb-2012
try this in R2
[a<] and [a<  ]
Ladislav:
18-Feb-2012
#[[BrianH
>> load "a<a>"
== [a <a>]
Looks good to me.
]]BrianH


Well, it does not look problematic at first sight, but it does 
look problematic once we compare it to:

>> load "1.2.3<t>"
** Syntax error: invalid "tuple" -- "1.2.3<t>"
** Where: to case load
** Near: (line 1) 1.2.3<t>
Ladislav:
18-Feb-2012
My opinion is that there needs to be a "common syntax rule", (either 
allowing #"<" as a syntax separator character or not)
Ladislav:
18-Feb-2012
Regarding these above three. What are the preferences of potential 
users:

a) reflect all these "as is" in the syntax.r code
b) do something else?
Steeve:
18-Feb-2012
We could produce several documents

(Btw I don't think it's a practical idea to keep mixing 
R2 and R3 syntax any further)
- R3 pure expected syntax (without glitches or inconsistencies)
- R2 pure expected syntax (without glitches or inconsistencies)
- R3 with glitches
- R2 with glitches
Steeve:
18-Feb-2012
Wowww, almost forgot.
in R3 [#] is a shortcut for [none]
Steeve:
18-Feb-2012
I use a function to automate the testing of the charsets
Steeve:
19-Feb-2012
in R3 the exception with the starting #"]" may be a bug
Steeve:
19-Feb-2012
To me it looks more related to faulty tag! decoding
BrianH:
19-Feb-2012
When people wanted to refer to the < word and couldn't use the 
lit-word syntax for arrow words (in R2 and pre-a97 R3), one way 
was to store that word in a block and use FIRST to get the value. 
However, in R2 that resulted in a value that LOAD choked on. The 
<] tradeoff was made really early on in the R3 project to solve that 
issue. The alternative would be to make MOLD mold [<] as [< ], or 
more specifically to make < mold as "< ", with an extra space every 
time.
Steeve:
19-Feb-2012
I would add that it's easily bypassed in R2 if one inserts a blank 
after <
>> [<  ]
== [<]
BrianH:
19-Feb-2012
The way MOLD is written, the values are molded by code that doesn't 
know it's in a block. You could have the ] handling code check against 
a charset of iffy characters and then optionally insert an extra 
space if found, but that doesn't deal with user-written code where 
[>] works and [<] doesn't. The usage of ] as the first character 
in a tag is so rare that it's not a bad tradeoff to make.
Andreas:
19-Feb-2012
Hmm, when : is in the first position, a : can occur anywhere afterwards 
as well.
Andreas:
19-Feb-2012
For example, [:a:@:b:]
Andreas:
19-Feb-2012
Or how would such a desire reflect?
BrianH:
19-Feb-2012
When I was trying to replicate the R3 word syntax, it was partly 
to document R3, partly to serve as the basis of a more flexible TRANSCODE 
that would allow people to handle more sloppy syntax without removing 
the valuable errors from the regular TRANSCODE, but mostly it served 
to generate new CC tickets for syntax bugs that we weren't aware 
of because the syntax wasn't well enough documented, and they hadn't 
come up in practice yet.
BrianH:
19-Feb-2012
There is a large, unknown number of such bugs in URL syntax, for 
instance. I wouldn't be surprised if that is the case with email 
too.
Andreas:
19-Feb-2012
Your initial message above sounded more like wishes towards a more 
restricted email!.
BrianH:
19-Feb-2012
A more thorough examination of the syntax makes more of these bugs 
obvious.
BrianH:
19-Feb-2012
I don't necessarily want a more restricted email! than it is already, 
but if we are expanding what is possible with email!, it will still 
likely need to be restricted relative to the email standard.
BrianH:
19-Feb-2012
I'm a little more concerned with R3 URL syntax though, since in that 
case there are real bugs that have already affected people in real 
cases, and because hypothetically a lot of the bugs are fixable in 
mezzanine code.
Andreas:
19-Feb-2012
And as the email! datatype can be used for many a purpose within 
dialects, it does not necessarily have to match RFC822 (or rather 
5322) exactly.
Andreas:
19-Feb-2012
(Which would be a relatively complex problem anyway ...

http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html)
BrianH:
19-Feb-2012
For url! the syntax checking is mostly done by the DECODE-URL mezzanine. 
We can't change what is recognized as a url! by REBOL, but we can 
change how the data is treated once it's recognized. There are errors 
in escape handling, for instance.
Ladislav:
20-Feb-2012
committed a couple of 1903-5 additions. You were right that #1905 
is ugly, Steeve.
BrianH:
23-Feb-2012
That's a good start! I'm really curious about whether urls and emails 
deal with chars over 127, especially in R3. As far as I know, the 
URI standards don't support them directly, but various internationalization 
extensions add recodings for these non-ASCII characters. It would 
be good to know exactly which chars are supported in the data model, 
so we can hack the code that supports that data to match.
BrianH:
23-Feb-2012
When last I checked, R3 considers all chars over 127 to be word-chars. 
It is considered to be none of REBOL's business whether a printer 
or display would show the character, so that even includes the additional 
Unicode space and control characters beyond ASCII. R3 has a binary 
parser, you see.
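The consequence of a byte-level scanner can be illustrated with a small sketch. This is not R3's scanner: it assumes UTF-8 input and a hypothetical rule (`is_word_byte`) under which ASCII letters and digits, plus every byte above 127, count as word bytes. Under that assumption, Unicode spaces beyond ASCII pass as word characters because all of their encoded bytes are above 127.

```python
def is_word_byte(b: int) -> bool:
    """Assumed rule for this sketch: ASCII letters and digits are word
    bytes, and so is every byte above 127."""
    return b > 127 or chr(b).isalnum()


def looks_like_word(text: str) -> bool:
    """Check the UTF-8 bytes of text one by one, as a binary parser would."""
    data = text.encode("utf-8")
    return len(data) > 0 and all(is_word_byte(b) for b in data)


ascii_space = looks_like_word("a b")        # ASCII space, byte 0x20
nbsp = looks_like_word("a\u00a0b")          # no-break space, bytes C2 A0
em_space = looks_like_word("a\u2003b")      # em space, bytes E2 80 83

print(ascii_space, nbsp, em_space)
```

Only the ASCII space splits the word in this sketch; the two Unicode spaces are indistinguishable from any other non-ASCII character at the byte level.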
Maxim:
23-Feb-2012
AFAICT  it's part of the datatype... since a space will go back and 
forth when you go to/from URL! and other types like string

(in R2 at least):
>> to-url "gogo://a.com/space here"
== gogo://a.com/space here
>> to-string gogo://a.com/space here
== "gogo://a.com/space here"
Steeve:
23-Feb-2012
Brian, Can you show me what is broken ? I'm a bit unsettled by your 
concern
BrianH:
23-Feb-2012
The escape decoding gets done too early. The decoding should not 
be done until after the URI structure has been parsed. If you do 
the escape decoding too early, characters that are escaped so that 
they won't be treated as syntax characters (like /) are treated as 
syntax characters erroneously. This is a bad problem for schemes 
like HTTP or FTP that can use usernames and passwords, because the 
passwords in particular either get corrupted or have inappropriately 
restricted character sets. IDN encoding should be put off until the 
last minute too, once we add support for Unicode to the url handlers 
of HTTP, plus any others that should support that standard.
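The ordering problem can be reproduced outside REBOL. The following is a minimal Python sketch using the standard `urllib.parse` module, not DECODE-URL itself; the URL, user name and password are hypothetical, with the password "p/w@x" written with its "/" and "@" percent-escaped.

```python
from urllib.parse import urlsplit, unquote

# Hypothetical URL: the password is "p/w@x", with "/" and "@" escaped
# so that they are not read as URL syntax.
raw = "ftp://user:p%2Fw%40x@example.com/dir/file.txt"

# Too early: decode the escapes first, then parse the structure.
# The decoded "/" now ends the authority part prematurely.
early = urlsplit(unquote(raw))

# Deferred: parse the structure first, then decode each component.
parts = urlsplit(raw)
late_password = unquote(parts.password)
late_host = parts.hostname

print("early netloc:", early.netloc)
print("late password:", late_password)
print("late host:", late_host)
```

With early decoding the authority part is cut off at the decoded slash, so neither the password nor the host survives; with deferred decoding both come out intact.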
Maxim:
23-Feb-2012
yep... and I've lost hours trying to get some ftp code to work because 
it had strange urls (with passwds)... which the interpreter would 
break all the time. 

At some point you are mystified by what is the actual URL being sent 
to the server.


once you see what is going on, you can get it to work, but it can 
take quite a long time to realize that you didn't actually send the 
url you expected, and to fix it properly once you've got a whole app 
expecting/playing with urls.
BrianH:
23-Feb-2012
I've been hoping to fix that. I can load a hot-patch into R2, and 
include a patch in a host kit build in R3 or replace functions from 
%rebol.r if necessary.
Steeve:
23-Feb-2012
OK, let me try to summarize our concerns.

The url! and email! syntax is more permissive than a valid URI. It's 
not a problem nor a design flaw.

The escape decoding should not be done at all when decoded as a part 
of an url! or email!. Right, but it will not be corrected until Carl 
does it.

DECODE-URL can be rewritten (used by schemes). The parser is too 
strict and can't deal with complex forms.
Steeve:
23-Feb-2012
In the %* form, R3 should recognise the ^ char as a normal char (not 
as an escaping notation), as R2 does.
BrianH:
23-Feb-2012
Worse than being a huge mess, R2 and R3 have different messes. R2 
MOLD fails to encode the % character properly. R3 chokes on the ^ 
character in unquoted mode, and allows both ^ and % escaping in quoted 
mode, and MOLDs the ^ character without encoding it (a problem because 
it chokes on that character). Overall the R2 MOLD problem is worse 
than all of the R3 problems put together because % is a more common 
character in filenames than ^, but both need fixing. I wish it just 
did one escaping method for files, % escaping, or did only % escaping 
for unquoted files and only ^ escaping for quoted files. % escaping 
doesn't support Unicode characters over 255, but no characters like 
that need to be escaped anyways - they can be written directly.