World: r3wp
[!REBOL3-OLD1]
Maxim 30-Oct-2009 [19292x2] | if you only use ascii (lower 127 chars) you will see no difference. |
since rebol will load files as UTF-8 by default code doesn't need it. | |
Gabriele 30-Oct-2009 [19294] | guys, please, we are in 2009... how is it possible you're still using latin1... |
Maxim 30-Oct-2009 [19295x2] | hum... cause everything I use is ascii or latin-1 ? |
well, windows ANSI.. | |
sqlab 30-Oct-2009 [19297] | Even if you live in 2009, most people were born in the last century. And many recipes are older than the people, at least the good ones. (Recipes for cooking food, I mean :) |
PeterWood 30-Oct-2009 [19298] | ..and sticking to the old ways means living with the old problems ... like not knowing how to interpret characters properly ... like AltME for example ... it makes the assumption that all text in messages is encoded as though it was entered on your own machine. So messages from Mac users are incorrectly displayed on Windows machines and vice-versa. For me, moving to utf-8 is a much easier problem to live with than not being able to properly share text across different platforms. It may be different for you. |
Henrik 30-Oct-2009 [19299] | REBOL3's philosophy should be simple: UTF-8 is default. Anything else is possible, but must be optionally selected. |
sqlab 30-Oct-2009 [19300x3] | Living with the old problems means knowing the old solutions. Displaying text in windows is a different problem to loading programs. |
By the way, Ticket #0000589 still leads to a crash, even though it was set to build again. | |
build = built | |
Maxim 30-Oct-2009 [19303] | re-open it. |
PeterWood 30-Oct-2009 [19304] | Loading programs is not totally immune from encoding problems. An unlikely but possible example: if name = "Ashley Trüter" [print "Hello Ashley"] |
sqlab 30-Oct-2009 [19305x2] | Then I would prefer that name and the string to compare have a unicode datatype, as in >> type? name == UTF-8. |
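A minimal sketch in Python (not REBOL, and not taken from the discussion) of why a character above 127 in a script is ambiguous. It assumes the two legacy encodings in play are Mac OS Roman and Windows code page 1252: the single byte 0x9F decodes to a different character under each, which is the cross-platform problem described above.

```python
# One byte above 127, decoded under two legacy single-byte encodings.
raw = bytes([0x9F])

as_mac = raw.decode("mac_roman")  # how a classic Mac application reads the byte
as_win = raw.decode("cp1252")     # how a Western-European Windows application reads it

print(repr(as_mac), repr(as_win))  # two different characters from the same byte

# Stored as UTF-8 instead, the character is a fixed byte sequence that
# decodes to the same character on every platform.
utf8_bytes = as_mac.encode("utf-8")
print(utf8_bytes.decode("utf-8") == as_mac)
```

With a single agreed encoding the comparison in the script no longer depends on which machine saved the file.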
if name = U8{Ashley ... | |
Maxim 30-Oct-2009 [19307x2] | but utf-8 editors aren't rare nowadays, and using utf-8 sequences isn't hard... really, if you tuely want to keep using an ascii editor |
tuely = truely | |
sqlab 30-Oct-2009 [19309] | But they do not convert automatically .. |
Maxim 30-Oct-2009 [19310x6] | handling encoding is complex in any environment... I had a lot of "fun" handling encodings in php, which uses such a unicode datatype... it's not really easier... cause you can't know by the text if it's unicode or ascii or binary values unless you tell it to load a sequence of bytes AS one or the other. |
with mashup software it's even worse... when done improperly, you end up with data with multiple encodings in the same document .... and then all hell breaks loose :-) | |
at least converging to utf-8, all scripts by all authors will work the same on all systems. | |
cause there is just ONE encoding. | |
but having some kind of default for read/write could be useful, instead of having to add a refinement all the time, and force a script to expect a specific encoding. | |
then it would be easier to change it in one place, do all I/O without the refinement. and less work for another to change encoding for the whole app and having to put conditionals every time we use read/write. | |
Pekr 30-Oct-2009 [19316] | Max - so what is an easy solution for me, to load my local scripts on my local system, which contain czech alphabet signs > 127? |
Maxim 30-Oct-2009 [19317x4] | IIRC it was intended to have a header attribute specifying the encoding of the script body... |
don't know if it's implemented or not. | |
I put a suggestion on the blog about allowing user-created encoding maps... otherwise, you can load it as binary in R3 and just convert the czech chars to utf-8 multi-byte sequences and convert the binary to string using decode. | |
is the czech encoding the standard windows ansi encoding? | |
PeterWood 30-Oct-2009 [19321] | Yes on Czech machines ..... I think it's Windows codepage 1250. I believe the default codepage on most US machines is 1252 (MS's extended version of ISO-8859-1). |
Maxim 30-Oct-2009 [19322] | ok yeah a few different diacritics between those two encodings |
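The conversion suggested above (read the script as raw bytes, then turn the Czech characters into UTF-8 sequences) can be sketched as follows. This is Python rather than R3, and it assumes the file really was saved in Windows code page 1250; the sample sentence stands in for the bytes that would be read from disk.

```python
# Assumption: the script on disk is encoded as Windows code page 1250.
# The encode() call below only stands in for reading the file as binary.
legacy_bytes = "Příliš žluťoučký kůň".encode("cp1250")

# Step 1: bytes -> characters, naming the source encoding explicitly.
text = legacy_bytes.decode("cp1250")

# Step 2: characters -> UTF-8; each accented letter becomes a two-byte sequence.
utf8_bytes = text.encode("utf-8")

print(len(legacy_bytes), len(utf8_bytes))  # the UTF-8 form is longer
```

The step that matters is the first one: the bytes carry no record of their code page, so the conversion only works if the source encoding is known and stated.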
Pekr 30-Oct-2009 [19323] | how do you approach the situation, if your script would contain two strings, in different encodings? Can it practically happen? |
Maxim 30-Oct-2009 [19324] | R3 will interpret literal strings and decode them using utf-8 (or the header encoding, if it's supported) so in this case no. but if the data is stored within binaries (equivalent to R2 which doesn't handle encoding) then, yes, since the binary represents the sequence of bytes not chars. if you use a utf-8 editor, and type characters above 127 and look at them in notepad, you will then see the UTF-8 byte sequences (which will look like garbled text, obviously). |
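A small Python sketch of the garbling described above, assuming the viewer treats the file as Windows code page 1252 (the "ANSI" default on Western-European Windows): each byte of a UTF-8 multi-byte sequence is shown as a separate character.

```python
text = "café"

utf8_bytes = text.encode("utf-8")          # what a UTF-8 editor writes to disk
seen_as_ansi = utf8_bytes.decode("cp1252") # what an ANSI-only viewer displays

print(seen_as_ansi)  # the two bytes of "é" appear as two unrelated characters

# Nothing is lost: reversing the wrong decoding recovers the original text.
recovered = seen_as_ansi.encode("cp1252").decode("utf-8")
print(recovered == text)
```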
Pekr 30-Oct-2009 [19325] | Is there utf-8 version of notepad? :-) |
Maxim 30-Oct-2009 [19326] | I don't know if R3 has a way of specifying the encoding literally... like UTF8{} UTF16{} or WIN1252{} ... this would be nice. |
PeterWood 30-Oct-2009 [19327] | A script could have two different encodings if differently encoded files were included. For example, you could use a script from Rebol.org in one of your scripts. You probably use Windows Code Page 1250 but most scripts in the library use other encodings. This doesn't give big problems as most of the code in the Library is "pure" ASCII |
Maxim 30-Oct-2009 [19328] | I use uedit which handles unicode natively when you want to... a lot of preferences for it ... |
PeterWood 30-Oct-2009 [19329] | Notepad can apparently handle both UTF-8 and UTF-16 http://en.wikipedia.org/wiki/Notepad_(Windows) |
Maxim 30-Oct-2009 [19330] | it tries to detect UTF based on text content... broken up until vista. http://en.wikipedia.org/wiki/Notepad_%28Windows%29 |
Carl 30-Oct-2009 [19331] | Ok, so... no one reads the wiki. That's ok... we're all developers. We don't read things other than code. So, here's a summary of R3 and Unicode: http://www.rebol.net/r3blogs/0286.html |
Gabriele 31-Oct-2009 [19332x5] | Max: maybe you should start using a real operating system. But, that aside, if you have any software that does not handle utf-8, simply trash it. guys, really, this is crazy, we are in 2009, let's put an end to this codepage crap! |
sqlab: what you say would make some sense if converting files was in any way difficult. (apart from the fact that you should have stopped using latin1 almost 10 years ago...). I've been using utf-8 with R2 for years... | |
sqlab: right, let's make string handling MORE COMPLEX. we definitely need that. let's copy MySQL, shall we? | |
Max: a system wide default means that my script will NOT run in the same way on your system. that is the definition of bad language design. | |
Petr: notepad, as most windows stuff, uses utf-16. much easier to detect though, and R3 could do that (actually, didn't Carl just add that recently?) most "real" editors allow you to use whatever encoding you want, and definitely support utf-8. | |
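One reason UTF-16 files are easy to detect is that they normally start with a byte order mark. The Python sketch below shows that kind of check; it is only an illustration, not a description of what R3 does, and `sniff_bom` is a hypothetical helper name.

```python
import codecs
from typing import Optional

# UTF-32 marks are tested first because the UTF-32 LE mark begins with
# the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a leading byte order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None  # no mark: the encoding has to be known some other way

# A UTF-16 little-endian file with a mark, and plain UTF-8 without one.
utf16_file = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
plain_file = "hi".encode("utf-8")

print(sniff_bom(utf16_file), sniff_bom(plain_file))
```

A file without a mark returns None here, which is the case where a fixed default such as UTF-8 has to decide.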
Pekr 31-Oct-2009 [19337] | Aha, I just realised that I have to use Save-as, and choose UTF-8 or Unicode, instead of default ANSI preset of notepad |
Maxim 1-Nov-2009 [19338] | gab, having a system-wide option will allow people to use the same code for different encoded source data. it doesn't break my code or yours. it's a setup not a definition. |
shadwolf 1-Nov-2009 [19339x2] | i love that documentation on bitset http://rebol.com/r3/docs/datatypes/bitset.html -> I give it A+++, a must-read documentation !! |
Carl, on the french forum they are waiting to know if you plan to bring the rebol world to the third-generation mobile phone area (android, windowsphone or iphone)? Could be a good way to show that not only adobe flash can provide things in that area, i think. But then a special "set" of features should be built into rebol to fit the particular needs of those platforms. What do you think about it, Carl? | |
Maxim 1-Nov-2009 [19341] | this is what the host code is for. |