World: r4wp

[#Red] Red language group

BrianH
10-Nov-2012
[3505]
For instance, users converting character encodings to Unicode, encodings 
like UTF8 or national encodings.
DocKimbel
10-Nov-2012
[3506]
Red should provide a UTF-8 codec. For national encodings, we would 
probably proceed by offering on-demand online codecs for the most 
used ones. That could be a shared resource with R3.
BrianH
10-Nov-2012
[3507x2]
I was talking about the codecs. They need to be written too, right? 
:)
Sorry if I missed the answer to this, but are you going to be doing 
a UTF8 binary parser for Red's source the way that R3 does for its 
source? Rather than a Unicode string parser, which processes the 
source after it's been through a codec?
DocKimbel
10-Nov-2012
[3509x2]
Yes, that's the "runtime lexer" in Red's roadmap. It is required 
for implementing LOAD (or TRANSCODE if you prefer).
BTW, we already have a UTF-8 binary parser in the Red compiler.
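A rough sketch, in Python rather than Red, of the distinction BrianH is asking about: lexing the raw UTF-8 bytes directly versus lexing a Unicode string produced by a codec. Both function names are hypothetical and neither mirrors the actual Red or R3 lexer; the sketch only assumes that tokens are separated by ASCII whitespace.

```python
# Illustrative only: these helpers are assumptions, not Red's or R3's lexer.

def lex_utf8_bytes(source: bytes) -> list:
    """Binary approach: find token boundaries in the raw bytes, then decode
    each token. Splitting on ASCII whitespace is safe because every byte of
    a multi-byte UTF-8 sequence is >= 0x80."""
    return [token.decode("utf-8") for token in source.split()]


def lex_decoded_string(source: bytes) -> list:
    """String approach: run the whole input through the codec first, then
    split the resulting Unicode string on whitespace."""
    text = source.decode("utf-8")
    return text.split()


sample = "caf\u00e9 na\u00efve 42".encode("utf-8")
tokens_binary = lex_utf8_bytes(sample)
tokens_string = lex_decoded_string(sample)
```

For this input both approaches yield the same three tokens; the binary approach simply never builds an intermediate string for the whole source.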
BrianH
10-Nov-2012
[3511]
I do prefer, actually. LOAD being mezzanine and calling a separate 
parser lets you do a lot of nice tricks. The "mezzanine" might be 
native in Red, but the separation of concerns is still a value. YMMV 
of course.
Andreas
10-Nov-2012
[3512]
Agreed. TRANSCODE is a rather inelegant name, though :)
BrianH
10-Nov-2012
[3513]
Agreed. The weird set of options turned out to be essential though. 
Every combination of options is used in LOAD at different points. 
We even need a /part option like that of DECOMPRESS/part, for the 
same reason.
DocKimbel
10-Nov-2012
[3514]
Agreed too for the separation and for the bad sounding name. ;-)
BrianH
10-Nov-2012
[3515]
Worse than being bad-sounding, it's a really general term being applied 
to a really specific operation. That name could have been used for 
a more general codec-based transformation process.
Jerry
15-Nov-2012
[3516]
Should Red/System series be one-based? There is some discussion 
about it. Why not add a #pragma for it, so programmers can set it themselves?
Pekr
15-Nov-2012
[3517]
I would say one based, but then other ppl reappear, which will claim 
R/S is not for me, but for C coders, so they will imo align it to 
zero :-)
Kaj
15-Nov-2012
[3518x2]
I've become a low level programmer again, and I still want it to 
be one based
A pragma would be the worst of both worlds: not making a decision, 
like most other software out there
DocKimbel
15-Nov-2012
[3520]
I agree that we should have only _one_ convention, else it will quickly 
become a nightmare when having to integrate 3rd-party code. We need 
to find some objective reasons for choosing it. 


For Red, I'm inclined to continue on the one-based convention that 
worked pretty well in R2 for many years (at least for me). I'm not 
very fond of the change in R3, introducing 0-based convention implicitly: 
it solves one problem (iterating over 0 index...I don't remember 
ever doing that), but introduces new ones (negative indexes now point 
to an IMHO counter-intuitive position, which will most probably 
lead to programming errors). For now, I prefer to stick to the R2 way 
until we find a better solution (feel free to propose some on the related 
GitHub tickets or here). For example, we could decide to ban indexes 
<= 0 (not my favorite option, but it would solve the problem 
simply).


For Red/System, a 0-based convention might make more sense, but it 
would push us into the R3 issue I've mentioned above wrt indexes 
<= 0. Also, as a dialect of Red, it can use whatever convention best 
fits its purpose, but OTOH, having the same convention as Red would 
help. So, I'm really undecided for Red/System.


I think the whole issue boils down to deciding on PICK behavior 
with <= 0 indexes; everything else should be able to fit in easily 
once that preliminary question is solved. It would be helpful if 
someone could put up everything related to this topic on a wiki page 
with all arguments sorted (there are a lot of them posted in the R3 
group a few weeks ago).
Kaj
15-Nov-2012
[3521x4]
Agreed
Off-by-one errors are everywhere in programming. Choosing between 
one-based and zero-based indexing shifts them to slightly different 
places, but they will still be there. As you said, I seldom encounter 
a situation where there would be a strong preference for indexes 
to be zero based
One-based is human friendly, while zero-based is usually more machine 
friendly, so I think REBOL made the right choice
By extension, I would like Red/System to be as close to Red as possible, 
so issues can be explained firmly and just once, and it's easy to 
morph Red code into Red/System code when you decide you need the 
performance
Pekr
15-Nov-2012
[3525]
Agreed with Kaj's last remark ....
Endo
15-Nov-2012
[3526]
+1
Andreas
15-Nov-2012
[3527x3]
Being a human myself, I don't find indices-as-ordinals ("one-based") 
particularly human friendly.
For Red/System, a 0-based convention might make more sense, but it 
would push us into the R3 issue I've mentioned above wrt indexes 
<= 0.


With indices-as-offsets ("0-based"), there really is no issue with 
indices <= 0.
As for R3, it did not really introduce "0-based convention implicitly"; 
it still is firmly "1-based" insofar as the first element in a 
series can be accessed using index 1.


When you want indices-as-ordinals, you really need to decide: (a) 
is the ordinal "zeroth" meaningful, and if so, what it means; (b) 
are negative indices meaningful, and if so, what they mean.


R3 went with the choices of (a) having meaningful zeroth, defined 
as "the item in a series before the first item", and (b) allowing 
negative indices, having index -1 as the immediate predecessor of 
index 0.


R2 went with the choice of (a) not having a meaningful zeroth, but 
instead of erroring out, functions (pick) & syntax (paths) accepting 
indices are lenient: passing an index of 0 always returns NONE. For 
(b), R2 allows negative indices and defines -1 as the immediate predecessor 
of 1.
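A rough Python model of the two behaviours Andreas describes, for illustration only. The names `pick_r2` and `pick_r3` are hypothetical, `pos` stands for the zero-based offset of the series' current position from its head, and Python's `None` stands in for NONE; none of this is REBOL source.

```python
# Sketch of the PICK semantics as summarised in the message above.
# pos is the 0-based offset of the current position; None models NONE.

def pick_r2(data, pos, index):
    """R2 as described: index 0 is a gap that always yields None,
    and -1 is the immediate predecessor of 1."""
    if index == 0:
        return None
    offset = pos + index - 1 if index > 0 else pos + index
    return data[offset] if 0 <= offset < len(data) else None


def pick_r3(data, pos, index):
    """R3 as described: index 0 is the item just before the first item,
    and -1 is the immediate predecessor of 0."""
    offset = pos + index - 1
    return data[offset] if 0 <= offset < len(data) else None


data = ["a", "b", "c", "d"]
pos = 2  # series positioned at "c"

r2_results = [pick_r2(data, pos, i) for i in (-2, -1, 0, 1, 2)]
# -> ["a", "b", None, "c", "d"]
r3_results = [pick_r3(data, pos, i) for i in (-2, -1, 0, 1, 2)]
# -> [None, "a", "b", "c", "d"]
```

Both models agree for positive indices; they differ only in how integers <= 0 map onto the items before the current position.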
DocKimbel
15-Nov-2012
[3530]
Andreas: thanks for the good sum up.


R3: agreed that index 1 is still the first element in a series, but 
index 0 is allowed and there is this ticket #613 that clearly aims 
at introducing 0-based indexing in R3...so my guess was that these 
different changes or wishes were interrelated. http://curecode.org/rebol3/ticket.rsp?id=613


R2: I would have really preferred that index 0 raise an error rather 
than return none.
Andreas
15-Nov-2012
[3531]
If you wish to allow index computation for series not positioned 
at the head, allowing index 0 is actually quite sensible, unless 
you want to make index computation particularly error prone.
DocKimbel
15-Nov-2012
[3532]
There is also the option proposed by Gabriele to consider: an ordinal! 
datatype (...-2th, -1th, 1st, 2nd, 3rd, 4th,...).


It could solve the whole thing, but I see two cons about this option: 

1) negative ordinals look odd, I don't even know if they can be read 
in English?

2) code would be more verbose as it will need conversions (to/from 
ordinals) in many places.


On the pro side, in addition, making a distinction between an integer 
and an ordinal might help improve code readability.
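To make the trade-off concrete, here is a minimal Python sketch of what such an ordinal type could look like. Everything in it is an assumption: the `Ordinal` class, its methods, and the choice that `-1th` means the item just before the current position are hypothetical, and this is not Gabriele's actual proposal.

```python
# Hypothetical stand-in for the proposed ordinal! datatype (not a real Red type).

class Ordinal:
    """Any non-zero integer; there is deliberately no zeroth ordinal."""

    def __init__(self, value: int):
        if value == 0:
            raise ValueError("there is no zeroth ordinal")
        self.value = value

    @classmethod
    def from_offset(cls, offset: int) -> "Ordinal":
        # Offsets 0, 1, 2... map to 1st, 2nd, 3rd...
        # Offsets -1, -2... map to -1th, -2th... (an assumption of this sketch).
        return cls(offset + 1 if offset >= 0 else offset)

    def to_offset(self) -> int:
        return self.value - 1 if self.value > 0 else self.value

    def __str__(self) -> str:
        if self.value < 0 or 10 <= self.value % 100 <= 20:
            suffix = "th"
        else:
            suffix = {1: "st", 2: "nd", 3: "rd"}.get(self.value % 10, "th")
        return f"{self.value}{suffix}"


def pick(data, pos, ordinal: Ordinal):
    """Pick relative to the 0-based position pos; None models NONE."""
    offset = pos + ordinal.to_offset()
    return data[offset] if 0 <= offset < len(data) else None


data = ["a", "b", "c", "d"]
labels = [str(Ordinal.from_offset(o)) for o in (-2, -1, 0, 1)]
# -> ["-2th", "-1th", "1st", "2nd"]
```

The explicit `from_offset` and `to_offset` calls illustrate the second con: integer arithmetic has to be converted at the boundary every time an index is computed.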
Andreas
15-Nov-2012
[3533]
The problem with no meaningful index 0 is that potentially meaningful 
index values are no longer isomorphic to integers. And as REBOL has 
no actual datatype for indices, all we can compute with are integers 
while relying on a correspondence of those integers to indices.


If you only ever compute indices for series positioned at the head, 
you get a nice correspondence of integers to indices, because meaningful 
indices for this series correspond to the positive integers.


But if you also want to compute indices for series positioned elsewhere, 
this nice integer-to-index correspondence breaks down as you suddenly 
have an undefined "gap" for the integer 0, whereas negative integers 
and positive integers are fine.
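The "gap" can be shown with a small Python model (hypothetical helper names, `None` standing in for NONE, `pos` being the zero-based offset of the current position). It assumes the R2 and R3 behaviours as summarised earlier in this thread, not the actual interpreters.

```python
# Sketch: why a gap at 0 makes plain integer arithmetic on indices error prone.

def pick_r2(data, pos, index):
    """R2-style model: index 0 yields None, -1 precedes 1."""
    if index == 0:
        return None
    offset = pos + index - 1 if index > 0 else pos + index
    return data[offset] if 0 <= offset < len(data) else None


def pick_r3(data, pos, index):
    """R3-style model: indices are contiguous, 0 precedes 1."""
    offset = pos + index - 1
    return data[offset] if 0 <= offset < len(data) else None


def index_from_offset_r2(offset):
    """Relative offset -> R2-style index; has to hop over the gap at 0."""
    return offset + 1 if offset >= 0 else offset


def index_from_offset_r3(offset):
    """Relative offset -> R3-style index; a plain shift by one."""
    return offset + 1


data = ["a", "b", "c", "d"]
pos = 2                      # series positioned at "c"

current = 1                  # index of the item at the current position
previous_naive = current - 1 # 0: lands in the gap under the R2 model

r2_indices = [index_from_offset_r2(o) for o in (-2, -1, 0, 1)]  # [-2, -1, 1, 2]
r3_indices = [index_from_offset_r3(o) for o in (-2, -1, 0, 1)]  # [-1, 0, 1, 2]
```

Under the R2 model, `pick_r2(data, pos, previous_naive)` returns `None` instead of `"b"`, while the same subtraction works under the R3 model because its indices are contiguous.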
Kaj
15-Nov-2012
[3534]
Yes, ordinal! would fix that, or the index! I proposed earlier
Andreas
15-Nov-2012
[3535]
I also think that an ordinal! (or index!) datatype may be an intriguing 
possibility to get the best of both worlds.
DocKimbel
15-Nov-2012
[3536]
Andreas: do you have a short code example involving index 0 in computation? 
I don't remember ever having issues with index 0 and I use series 
with offsets a lot! Though, Ladislav claims he and Carl did encounter 
such issue at least once...the use cases for this issue remain a 
mystery well kept by Ladislav. ;-)
Andreas
15-Nov-2012
[3537x2]
I personally avoid computing with non-head positioned series wherever 
possible.
So sorry, I don't have a particular example at hand, but I can easily 
imagine it coming up with e.g. forall or forskip and trying to access 
previous values in an iteration.
Ladislav
15-Nov-2012
[3539]
Though, Ladislav claims he and Carl did encounter such issue at least 
once

 - I am claiming that I have revealed a bug in Carl's code caused 
 by the fact that indices are not isomorphic to integers, i.e. they 
 "contain a gap". That is a totally different issue than whether indexing 
 should be 1-based or 0-based.
DocKimbel
15-Nov-2012
[3540]
Ladislav: I think it is relevant to this topic, as finding out whether 
the index 0 gap is a real practical issue or not could help decide 
about the indexing base.
Ladislav
15-Nov-2012
[3541x6]
Then: gap is a practical issue, causing bugs.
(no matter whether indexing is 0-based or 1-based)
That is caused by the fact that there is no gap in the series, the 
gap is only caused by "unreasonable thinking".
Also, if we define 0-gap (possible) then we do not have any right 
to use negative indices.
, i.e. the gap needs to be prolonged to infinity, otherwise we are 
simply inconsistent
I'm inclined to continue on the one-based convention that worked 
pretty well in R2 for many years
 - actually, R2 is "hybrid", since SKIP is zero-based, in fact.
DocKimbel
15-Nov-2012
[3547]
Can you define "unreasonable thinking"?
Andreas
15-Nov-2012
[3548]
Have a look at the following illustration:
https://gist.github.com/5af73d4ecf93ac94680a
DocKimbel
15-Nov-2012
[3549]
SKIP works with offsets only, it's not related to indexing.
Andreas
15-Nov-2012
[3550]
About the only somewhat reasonable use I can come up with for R2's 
behaviour is to allow writing literal -1 indices in paths (values/-1) 
to access the value preceding the current position.
Ladislav
15-Nov-2012
[3551x3]
Yes, I can in this case: "unreasonable thinking" here is the fact 
that the "mathematical model" - in this case the numbering of positions 
in series differs substantially from the properties of the object 
it is modelling - in this case there is a difference between the 
"no-gap in the series" versus "gap in the mathematical model".
Also, there is one more mathematical inconsistency: if I "hate zero", 
I simply cannot use negative numbers, otherwise I am being inconsistent.
SKIP works with offsets only, it's not related to indexing.

 - that is not true, in fact. It *is* related to indexing, since we 
 may always use PICK SKIP SERIES N M versus PICK SERIES K and these 
 things are related, like it or not.
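One way to read this relation, sketched in Python under stated assumptions: SKIP takes a zero-based offset, PICK takes a one-based index, and for a series at its head with non-negative N and positive M the two expressions address the same item when K = N + M. The helper names are hypothetical and `None` models NONE.

```python
# Sketch of the relation between PICK SKIP SERIES N M and PICK SERIES K.
# Positions are modelled as 0-based offsets from the head of the series.

def skip(data, pos, n):
    """Move the position by n items, clamped to the head and tail."""
    return max(0, min(len(data), pos + n))


def pick(data, pos, m):
    """One-based pick for positive m only; anything else yields None."""
    offset = pos + m - 1
    return data[offset] if m >= 1 and 0 <= offset < len(data) else None


def same_item(data, n, m):
    """True when PICK SKIP SERIES N M and PICK SERIES (N + M) agree."""
    head = 0
    return pick(data, skip(data, head, n), m) == pick(data, head, n + m)


data = ["a", "b", "c", "d", "e"]
example = pick(data, skip(data, 0, 2), 1)  # -> "c", same as pick(data, 0, 3)
```

The zero-based offset and the one-based index differ by a constant shift here, which is why a choice made for one constrains the other.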
DocKimbel
15-Nov-2012
[3554]
 in this case the numbering of positions in series differs substantially 
 from the properties of the object it is modelling


Is this again the "in-between position" vs. value-counting interpretation 
difference?