AltME groups: search
results summary
world | hits |
r4wp | 32 |
r3wp | 11 |
total: | 43 |
results window for this page: [start: 1 end: 43]
world-name: r4wp
Group: #Red ... Red language group [web-public] | ||
DocKimbel: 24-Sep-2012 | Yes, Latin-1 / UCS-2 / UCS-4 | |
DocKimbel: 8-Nov-2012 | A series buffer has a header, with OFFSET and TAIL pointers that define respectively the beginning and end of the series slots. The OFFSET pointer allows reserving space at the head of the series to optimize insertions at head. Series slots can be 1 (binary/UTF-8/Latin-1), 2 (UCS-2), 4 (UCS-4) or 16 (value!) bytes wide. | |
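The buffer layout described in this message can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the type `series_buffer` and the functions `series_make`, `series_append`, `series_insert_head` and `series_length` are hypothetical names, not Red's runtime API, and Red's real header holds more than is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout; field and function names are illustrative only. */
typedef struct {
    size_t unit;           /* slot width in bytes: 1, 2, 4 or 16 */
    size_t capacity;       /* total bytes available in data[] */
    unsigned char *offset; /* first used slot */
    unsigned char *tail;   /* one past the last used slot */
    unsigned char data[];  /* slot storage */
} series_buffer;

/* Allocate room for `slots` slots, leaving `head_reserve` of them free
   in front of OFFSET so that head insertions need no shifting. */
series_buffer *series_make(size_t unit, size_t slots, size_t head_reserve)
{
    assert(unit == 1 || unit == 2 || unit == 4 || unit == 16);
    assert(head_reserve <= slots);
    series_buffer *s = malloc(sizeof *s + unit * slots);
    if (s == NULL) return NULL;
    s->unit = unit;
    s->capacity = unit * slots;
    s->offset = s->data + unit * head_reserve;
    s->tail = s->offset;
    return s;
}

/* Number of used slots between OFFSET and TAIL. */
size_t series_length(const series_buffer *s)
{
    return (size_t)(s->tail - s->offset) / s->unit;
}

/* Append one slot at TAIL; returns 0 on success, -1 if no room is left. */
int series_append(series_buffer *s, const void *slot)
{
    size_t room = (size_t)((s->data + s->capacity) - s->tail);
    if (room < s->unit) return -1;
    memcpy(s->tail, slot, s->unit);
    s->tail += s->unit;
    return 0;
}

/* Insert one slot at the head by moving OFFSET back into the reserve;
   returns -1 once the reserved space is used up. */
int series_insert_head(series_buffer *s, const void *slot)
{
    size_t reserve = (size_t)(s->offset - s->data);
    if (reserve < s->unit) return -1;
    s->offset -= s->unit;
    memcpy(s->offset, slot, s->unit);
    return 0;
}
```

In this sketch a head insertion is a pointer decrement plus one copy, which is the optimization the reserved space is meant to buy; a real implementation would also need a growth strategy once either end runs out of room.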
PeterWood: 27-Mar-2013 | Actually, Gregg's test works under OS X: Schulz:Red peter$ ./console -=== Red Console alpha version ===- (only Latin-1 input supported) red>> s: copy "" == "" red>> append/dup s #" " 10 == " " red>> length? s == 10 | |
DocKimbel: 17-Apr-2013 | BTW, if you stick to Latin-1, you shouldn't have the need for any conversion? | |
DocKimbel: 17-Apr-2013 | From a routine, if str is a red-string! pointer, this is the dispatch code you would need to use: s: GET_BUFFER(str) switch GET_UNIT(str) [ Latin-1 [...conversion code...] UCS-2 [...conversion code...] UCS-4 [...conversion code...] ] | |
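For readers more at home in C, the same dispatch-on-unit idea might look like the sketch below. The enum `unit_t` and the function `codepoint_at` are hypothetical names chosen for this illustration (they are not Red/System identifiers), and slots are assumed to be stored in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unit tags; the numeric value is the slot width in bytes. */
typedef enum { UNIT_LATIN1 = 1, UNIT_UCS2 = 2, UNIT_UCS4 = 4 } unit_t;

/* Read the code point stored in slot `index` of a buffer whose slots
   are `unit` bytes wide (native byte order assumed). */
uint32_t codepoint_at(const void *buf, unit_t unit, size_t index)
{
    const unsigned char *p = (const unsigned char *)buf + index * (size_t)unit;
    switch (unit) {
    case UNIT_LATIN1:
        return *p;
    case UNIT_UCS2: {
        uint16_t v;
        memcpy(&v, p, sizeof v);
        return v;
    }
    case UNIT_UCS4: {
        uint32_t v;
        memcpy(&v, p, sizeof v);
        return v;
    }
    }
    assert(!"unknown unit");
    return 0;
}
```

Each branch is where the per-encoding conversion code from the message would go; here it only widens the slot to a 32-bit code point.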
Kaj: 26-Apr-2013 | I found out that not only does Red not support Unicode, it doesn't support Latin-1, not even on Windows | |
Kaj: 26-Apr-2013 | Both the compile time and runtime lexers don't let Latin-1 through | |
Kaj: 26-Apr-2013 | -=== Red Console alpha version ===- (only Latin-1 input supported) red>> s: "Español" == "Espa" red>> length? s == 4 | |
Kaj: 26-Apr-2013 | A very similar thing happens when I paste Latin-1 into Windows | |
Kaj: 26-Apr-2013 | Here's what happens when I try to compile Latin-1 source code: | |
DocKimbel: 26-Apr-2013 | You can't paste UTF-8 in the console, it supports only Latin-1. | |
Kaj: 26-Apr-2013 | Yes, so the Latin-1 promise is false | |
Kaj: 26-Apr-2013 | You can't paste Latin-1 | |
DocKimbel: 26-Apr-2013 | Are you sure you're pasting Latin-1 and not UTF-8? | |
Kaj: 26-Apr-2013 | string/load can only load UTF-8, so only ASCII and UTF-8 files can be read, not Latin-1 | |
Kaj: 26-Apr-2013 | For example, there's an internal single byte encoding that's marked "Latin1", but I now know there is no way to get Latin-1 data in or out, so I wonder if this encoding will ever be used for more than 7-bit ASCII | |
Kaj: 26-Apr-2013 | Actually, I did one test that confirms Andreas' statement. The only way to get 8-bit data in is to compile a UTF-8 string literal that fits into Latin-1 | |
Kaj: 26-Apr-2013 | No, the console says you can input Latin-1, and you can't, not even through UTF-8 | |
Kaj: 26-Apr-2013 | Neither can you compile Latin-1 nor read Latin-1 files nor other source data | |
Kaj: 26-Apr-2013 | -=== Red Console alpha version ===- (only Latin-1 input supported) red>> s: "Español" == "Espa" red>> length? s == 4 | |
Kaj: 26-Apr-2013 | Yes, and neither works, so there is no Latin-1 support at all, except in a corner case internally | |
Kaj: 26-Apr-2013 | Yes, and as you say, it's mislabeled Latin1, so there were several things leading me to believe that Red already had Unicode and Latin-1 support | |
Kaj: 26-Apr-2013 | Yes, I think it's very dangerous to claim that Red has Unicode and Latin-1 support | |
DocKimbel: 28-Apr-2013 | 1) "I found out that not only does Red not support Unicode, it doesn't support Latin-1, not even on Windows" Red *does* "support" Unicode, Latin-1 "support" was not claimed in Red, except for the console script. I've put quotes around the word "support", because you're mixing up internal representation and I/O encoding formats. | |
DocKimbel: 28-Apr-2013 | 5) "Yes, I think it's very dangerous to claim that Red has Unicode and Latin-1 support". Red *has* Unicode support, string! and word! values support Unicode, input Red scripts are Unicode, PRINT outputs Unicode characters. Latin-1 is used as an *internal* encoding format, I don't remember ever claiming that "Red supports Latin-1 for I/O" except for the console script (which is wrong, I agree). OTOH, I do remember thinking about supporting it at the beginning for printing, then I found it cumbersome to support in addition to Unicode mode and dropped it during the implementation. | |
DocKimbel: 28-Apr-2013 | So, about the console issue, the runtime lexer is able to parse Latin-1 input but the input string gets internalized before being passed to the lexer using the UTF-8 loader, which chokes on MSDOS console incompatible codepages. For the Unix version, the console input being in UTF-8 by default, it passes the internalization, but crashes the runtime lexer. | |
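To make the loader issue concrete, here is a small C sketch of why a strict UTF-8 decoder cannot accept raw Latin-1 bytes above 0x7F. The function `utf8_valid_prefix` is a hypothetical helper written for this illustration and is not Red's loader; it only reports how many leading bytes form well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Return the number of leading bytes of s[0..len) that form
   well-formed UTF-8; decoding stops at the first invalid sequence. */
size_t utf8_valid_prefix(const unsigned char *s, size_t len)
{
    size_t i = 0;
    assert(s != NULL || len == 0);
    while (i < len) {
        unsigned char b = s[i];
        unsigned long cp, min;
        size_t need, k;

        if (b < 0x80) { i++; continue; }                 /* ASCII */
        else if ((b & 0xE0) == 0xC0) { need = 1; cp = b & 0x1Fu; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { need = 2; cp = b & 0x0Fu; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { need = 3; cp = b & 0x07u; min = 0x10000; }
        else break;                                       /* stray byte */

        if (len - i <= need) break;                       /* truncated */
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) break;         /* not a continuation */
            cp = (cp << 6) | (s[i + k] & 0x3Fu);
        }
        if (k <= need) break;
        /* Reject overlong forms, surrogates and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) break;
        i += need + 1;
    }
    return i;
}
```

With the Latin-1 bytes of "Español", the byte 0xF1 (ñ) looks like the lead byte of a four-byte sequence that never arrives, so only the first four bytes ("Espa") are well-formed; the UTF-8 form, where ñ is 0xC3 0xB1, passes in full. Whether this is the exact mechanism behind the console output quoted earlier is an assumption, not something the messages confirm.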
DocKimbel: 28-Apr-2013 | Kaj, it seems to me that you were confused by a few things: - console script banner wrong statement (my fault) - internal "Latin-1" naming (like in Python's internals) which might be misleading (there's no other closer naming in Unicode for one byte representation AFAIK, though some people call it "UCS-1", maybe we should adopt that too). - "Unicode support" seems to imply to you that *all* possible Unicode encodings have to be supported (with encoders/decoders). It doesn't, having just one encoding supporting the full Unicode range (like UCS-4) is enough for claiming "Unicode support". | |
Arnold: 29-Apr-2013 | There is, as I read this, a different issue. Doc wants Red to be as complete as possible, Kaj wants it to be officially usable. Kaj really needs UTF-8 (and/or Latin-1) character support; I guess this has to do with the Syllable operating system, amongst others. I would like Red to support time and random functions as natives and (Gregg, is one of your mezz funcs REJOIN? I want that too) be able to connect to a MySQL database so I can dump PHP for some web development. Besides that, we would all love to see a VID(-like) solution for display and creating apps. We have to be patient, agreed 100% amongst everybody? While the roadmap mentions all the things needed to progress Red, the above things are not on that list. I want Red to have enough to make it usable in production and after that expand; imho that is the way to really attract more funding/enthusiastic programmers and make sure current support does not fade or lose interest. | |
Group: Announce ... Announcements only - use Ann-reply to chat [web-public] | ||
Kaj: 8-Jan-2013 | The Red binding uses the same string marshalling as the Red console, so the same limitations apply. Latin-1 only and string values of 64 bytes or longer may not work | |
Kaj: 27-Apr-2013 | I implemented UTF-8 output support for Red. I ended up writing optimised versions based more on the Red print backend. I integrated them in my I/O routines and made heavy performance optimisations. Thanks to Peter for leading the way. There are the following Red/System encoders embedded in %common.red: http://red.esperconsultancy.nl/Red-common/dir?ci=tip to-UTF8: encodes a Red string into UTF-8 Red/System c-string! format. to-local-file: encodes a Red string into Latin-1 Red/System c-string! format on Windows, and into UTF-8 on other systems. This yields a string suitable for the local file name APIs. Latin-1 can be output as long as it was input into Red via UTF-8. Non-Latin-1 code points cannot be encoded in Latin-1 and yield a NULL for the entire result. These encoders make use of the Latin1-to-UTF8, UCS2-to-UTF8 and UCS4-to-UTF8 encoding functions. An example of their use in the Red READ and WRITE functions is in %input-output.red | |
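The two behaviours described in this announcement (Latin-1 to UTF-8 always succeeds, while encoding into Latin-1 yields NULL for the whole result if any code point does not fit) can be sketched in C as below. The names `latin1_to_utf8` and `ucs4_to_latin1` are hypothetical stand-ins written for this illustration; the actual Red/System encoders live in %common.red and may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode `len` Latin-1 bytes as a NUL-terminated UTF-8 C string.
   Every Latin-1 byte needs at most two UTF-8 bytes. Caller frees. */
char *latin1_to_utf8(const unsigned char *src, size_t len)
{
    assert(src != NULL || len == 0);
    char *out = malloc(2 * len + 1);
    size_t j = 0;
    if (out == NULL) return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            out[j++] = (char)c;                    /* ASCII: one byte */
        } else {
            out[j++] = (char)(0xC0 | (c >> 6));    /* lead byte */
            out[j++] = (char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    out[j] = '\0';
    return out;
}

/* Encode `len` UCS-4 code points as a NUL-terminated Latin-1 C string.
   Returns NULL for the entire result if any code point is above 0xFF
   (or if allocation fails). Caller frees. */
char *ucs4_to_latin1(const uint32_t *src, size_t len)
{
    assert(src != NULL || len == 0);
    char *out = malloc(len + 1);
    if (out == NULL) return NULL;
    for (size_t i = 0; i < len; i++) {
        if (src[i] > 0xFF) {
            free(out);
            return NULL;
        }
        out[i] = (char)(unsigned char)src[i];
    }
    out[len] = '\0';
    return out;
}
```

Returning NULL rather than a lossy substitution keeps a file name from silently changing into a different name, which seems to be the reasoning behind the all-or-nothing rule described above.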
Kaj: 27-Apr-2013 | I used the new encoding functions in all my Red bindings: those for the C library, input/output via files and cURL, 0MQ, SQLite and GTK+. In as many places as possible, data marshalled to the external libraries now supports UTF-8. File names on Windows support Latin-1. Files and URLs are always read and written as UTF-8, including on Windows. Red does not support loading Latin-1 strings. | |
Kaj: 27-Apr-2013 | I've updated the binary downloads. The red console interpreters and all the Red examples include the above encoding support now, and all the latest Red features: http://red.esperconsultancy.nl/Red-test/dir?ci=tip For example, the Red/GTK-text-editor now supports writing UTF-8 files with UTF-8 or Latin-1 names. I've added an MSDOS\Red\red-core.exe for Windows 2000, because the GTK+ libraries in red.exe require Windows XP+. |
world-name: r3wp
Group: Core ... Discuss core issues [web-public] | ||
DanielSz: 14-Nov-2007 | BTW, I noticed that rebol.org serves pages in utf-8 encoding, but the scripts themselves are latin-1. This is not a problem for the code, but it is a problem for the comments, which may contain accented characters. For example, names of authors (hint: Robert Müench), and they consequently appear garbled. I'm not saying pages should be served as latin-1, on the contrary, I am an utf-8 enthusiast, I think rebol scripts themselves should be encoded as utf-8, (it is possible with python, for example). I hope Rebol3 will be an all encompassing utf-8 system (am I dreaming?). | |
BrianH: 5-Mar-2009 | kib2: "Does that mean that we can use unicode encoding with the help of r2-forward ?" No, I can only spoof datatypes that don't exist in R2, and R2 has a string! type. The code should be equivalent if the characters in the string are limited to the first 256 codepoints of Unicode (aka Latin-1), though only the first 128 codepoints (aka ASCII) can be converted from binary! to string! and have the binary data be the same as minimized UTF-8. | |
BrianH: 30-Jan-2010 | latin1?: func [ "Returns TRUE if value or string is in Latin-1 character range (below 256)." value [string! file! email! url! tag! issue! char! integer!] ; Not binary! ][ ; R2 has Latin-1 chars and strings either integer? value [value < 256] [true] ] ; Note: Native (and more meaningful) in R3. For forwards compatibility. | |
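The range test behind `latin1?` and the ASCII remark in the earlier message can be expressed in C along these lines. This is a sketch over an array of already-decoded code points; `all_below`, `is_latin1` and `is_ascii` are hypothetical names, not part of any REBOL API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if every code point is strictly below `limit`, else 0.
   An empty sequence counts as in range. */
int all_below(const uint32_t *cps, size_t len, uint32_t limit)
{
    assert(cps != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (cps[i] >= limit) return 0;
    }
    return 1;
}

/* Latin-1 covers the first 256 Unicode code points. */
int is_latin1(const uint32_t *cps, size_t len)
{
    return all_below(cps, len, 256);
}

/* ASCII covers the first 128; only these have identical bytes in
   Latin-1 and in UTF-8. */
int is_ascii(const uint32_t *cps, size_t len)
{
    return all_below(cps, len, 128);
}
```

A string such as "Español" passes `is_latin1` but not `is_ascii`, which is the case where the Latin-1 bytes and the UTF-8 bytes differ.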
Group: Script Library ... REBOL.org: Script library and Mailing list archive [web-public] | ||
Sunanda: 16-Mar-2009 | Thanks guys. Other scripts with the same problem.....there are a couple. About 10% of all scripts have at least one extended ASCII char....But most of them are acceptable in LATIN-1 code page / charset (eg copyright symbol, some accented letters). It's just a very few scripts that use 1/4 and similar symbols that cause the problem. What other editors? Windows NOTEPAD is one example of a common one that gets this wrong. | |
Group: !REBOL3-OLD1 ... [web-public] | ||
Sunanda: 31-Jul-2009 | But it's R2 compatible :) There are other edge cases -- Latin-1 chars that can be _in_ a word but not _start_ them, and do not serialise well.....I did a script and found them all once | |
Maxim: 11-Sep-2009 | but if we need to output latin-1 afterwards (while dumping the html content, for example), the output encoding should be selectable as a "current default", and all the --cgi would do is set that default to UTF-8 for example. | |
Maxim: 30-Oct-2009 | I also think the "default" user text format should be configurable. I have absolutely no desire to start using utf-8 for my code and data, especially when I have a lot of stuff that already is in iso latin-1 encoding. | |
Maxim: 30-Oct-2009 | hum... cause everything I use is ascii or latin-1 ? | |
Group: !REBOL2 Releases ... Discuss 2.x releases [web-public] | ||
BrianH: 2-Jan-2010 | OK, now that we have 2.7.7 released (even though there is more work to do, i.e. platforms and the SDK), it is time to look ahead to 2.7.8 - which is scheduled for release in one month on February 1. The primary goal of this release is to migrate to REBOL's new development infrastructure. This means: - Migrating the RAMBO database to a new CureCode project and retiring RAMBO. - Using Carl's generation code for the manual to regenerate the R2 manual, so we can start to get to work updating it. - Porting the chat client to R2 using the new functions and building a CHAT function into R2 similar to the R3 version. The R2 chat client might be limited to the ASCII character set, though support for the Latin-1 character set might be possible. Still text mode for now, though if anyone wants to write a GUI client (Henrik?) we can put it on the official RT reb site accessible from the View desktop. The server is accessed through a simple RPC protocol and is designed to be easily scriptable. It turns out that Carl already rewrote the installer for 2.7.something, but it was turned off because of a couple minor bugs that we were able to fix in 2.7.7. With any luck, only minor fixes to the registry usage will be needed and we'll be good to go. As for the rest, it's up to you. Graham seems to have a good tweak to the http protocol, and others may want to contribute their fixes. | |
Group: !REBOL3 Extensions ... REBOL 3 Extensions discussions [web-public] | ||
Oldes: 11-Nov-2010 | So with Cyphre's help I have this function: char* rebser_to_utf8(REBSER* series) { char *uf8str; REBCHR* str; REBINT result = RL_GET_STRING(series, 0 , (void**)&str); if (result > 0){ //unicode string int iLen = wcslen(str); int oLen = iLen * sizeof(REBCHR); uf8str = malloc(oLen); int result = WideCharToMultiByte(CP_UTF8, 0, str, iLen, uf8str, oLen, 0, 0); if (result == 0) { int err = GetLastError(); RL->print("err: %d\n", err); } } else if (result < 0) { //bytes string (ascii or latin-1) uf8str = malloc(strlen((char *)str)); strcpy(uf8str, (char *)str); } return uf8str; } and I can then use: .. char *filename = rebser_to_utf8(RXA_SERIES(frm, 1)); status=MagickReadImage(current_wand, filename); free(filename); if (status == MagickFalse) { ThrowWandException(current_wand); } return RXR_TRUE; | |
Oldes: 11-Nov-2010 | This seems to be working: char* REBSER_to_UTF8(REBSER* series) { char *uf8str; REBCHR* str; REBINT result = RL_GET_STRING(series, 0 , (void**)&str); if (result > 0){ //unicode string int iLen = wcslen(str); //int oLen = iLen * sizeof(REBCHR); int oLen = WideCharToMultiByte( CP_UTF8, 0, str, -1, NULL, 0, NULL, NULL); uf8str = malloc(oLen); int result = WideCharToMultiByte(CP_UTF8, 0, str, iLen, uf8str, oLen, 0, 0); if (result == 0) { int err = GetLastError(); RL->print("err: %d\n", err); } uf8str[oLen] = 0; } else if (result < 0) { //bytes string (ascii or latin-1) uf8str = strdup((char *)str); } return uf8str; } |
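A note on buffer sizes in the two versions above, with a portable sketch. When `WideCharToMultiByte` is called with a length of -1, the size it reports already includes the terminating NUL, so writing at index `oLen` lands one byte past the allocation; similarly, `malloc(strlen(...))` followed by `strcpy` leaves no room for the terminator. The sketch below shows the same measure-then-convert pattern without the Windows API, using a hand-written UTF-16 encoder instead. The names `put_utf8`, `utf16_to_utf8_pass` and `utf16_to_utf8` are hypothetical, and `uint16_t` stands in for a two-byte REBCHR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encode one code point as UTF-8. If out is NULL, only count.
   Returns the number of bytes the code point takes. */
size_t put_utf8(uint32_t cp, char *out)
{
    if (cp < 0x80) {
        if (out) out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        if (out) {
            out[0] = (char)(0xC0 | (cp >> 6));
            out[1] = (char)(0x80 | (cp & 0x3F));
        }
        return 2;
    }
    if (cp < 0x10000) {
        if (out) {
            out[0] = (char)(0xE0 | (cp >> 12));
            out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[2] = (char)(0x80 | (cp & 0x3F));
        }
        return 3;
    }
    if (out) {
        out[0] = (char)(0xF0 | (cp >> 18));
        out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (char)(0x80 | (cp & 0x3F));
    }
    return 4;
}

/* One pass over a NUL-terminated UTF-16 string. With out == NULL it
   measures; otherwise it writes. Returns bytes, excluding the NUL. */
size_t utf16_to_utf8_pass(const uint16_t *src, char *out)
{
    size_t n = 0;
    while (*src) {
        uint32_t cp = *src++;
        if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF) {
            /* Valid surrogate pair: combine into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* unpaired surrogate: replacement character */
        }
        n += put_utf8(cp, out ? out + n : NULL);
    }
    return n;
}

/* Measure, allocate size + 1, convert, terminate. Caller frees. */
char *utf16_to_utf8(const uint16_t *src)
{
    assert(src != NULL);
    size_t n = utf16_to_utf8_pass(src, NULL);
    char *out = malloc(n + 1);
    if (out == NULL) return NULL;
    utf16_to_utf8_pass(src, out);
    out[n] = '\0';
    return out;
}
```

The key point is that the measured size and the terminator are accounted for separately: the allocation is `n + 1` bytes and the NUL goes at index `n`, the last valid index.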