Mailing List Archive: 49091 messages

[REBOL] Re: utf8-encode

From: oliva::david::seznam::cz at: 7-Jun-2002 0:04

R> nobody has utf8-encoder?
R> probably will have to write one by myself

OK... why is there NO native right/left shift function in Rebol?!

Here is how UTF-8 works:

    putwchar(c)
    {
      if (c < 0x80) {
        putchar (c);
      }
      else if (c < 0x800) {
        putchar (0xC0 | c>>6);
        putchar (0x80 | c & 0x3F);
      }
      else if (c < 0x10000) {
        putchar (0xE0 | c>>12);
        putchar (0x80 | c>>6 & 0x3F);
        putchar (0x80 | c & 0x3F);
      }
      else if (c < 0x200000) {
        putchar (0xF0 | c>>18);
        putchar (0x80 | c>>12 & 0x3F);
        putchar (0x80 | c>>6 & 0x3F);
        putchar (0x80 | c & 0x3F);
      }
    }

And here is my Rebol version:

    rebol [
        title: "UTF-8 encode"
        purpose: {Encodes the string data to UTF-8}
        author: "oldeS"
        email: [oliva--david--seznam--cz]
        date: 7-Jun-2002/0:03:27+2:00
        usage: {
    >> utf8-encode "czech chars: ìšèøžýáíé"
    == "czech chars: ìšèøžýáíé"}
        comment: {More info: http://czyborra.com/utf/ }
    ]

    shift: func [
        "Takes a base-2 binary string and shifts bits"
        data [string! binary!]
        places [integer!]
        /left /right
    ][
        data: enbase/base data 2
        either right [
            ; drop the low bits, pad with zeros on the left
            remove/part tail data negate places
            data: head insert/dup head data #"0" places
        ][
            ; drop the high bits, pad with zeros on the right
            remove/part data places
            insert/dup tail data #"0" places
        ]
        return debase/base data 2
    ]

    utf8-encode: func [
        "Encodes the string data to UTF-8"
        str [any-string!] "string to encode"
        /local c
    ][
        str: to-binary str
        forall str [
            ; bytes above 0x7F (c >= 0x80) become a two-byte sequence
            if #{7F} < c: to-binary to-char first str [
                remove str
                insert str join (#{c0} or shift/right c 6) (c and #{3F} or #{80})
                str: next str
            ]
        ]
        to-string head str
    ]
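
For comparison, here is a minimal C sketch of the same scheme that writes into a buffer instead of calling putchar. The names utf8_encode_char and latin1_to_utf8 are hypothetical, not from the original post. The second function assumes, as the Rebol version appears to, that every input byte is a code point in the range 0..255, so only the one-byte and two-byte branches are ever reached. The ranges follow the pseudocode above; note that the current UTF-8 definition (RFC 3629) stops at U+10FFFF, which this sketch does not enforce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into `out`,
 * which must have room for 4 bytes. Returns the number of bytes
 * written, or 0 if `c` is outside the ranges handled above. */
size_t utf8_encode_char(uint32_t c, unsigned char *out)
{
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    } else if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    } else if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    } else if (c < 0x200000) {
        out[0] = (unsigned char)(0xF0 | (c >> 18));
        out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (c & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: treat each of the `n` input bytes as a code
 * point 0..255 and encode it. `out` must have room for 2 * n bytes.
 * Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        written += utf8_encode_char(in[i], out + written);
    }
    return written;
}
```

With this sketch, the byte 0xE9 comes out as the two bytes C3 A9, while anything up to 0x7F (for example "z", 0x7A) passes through unchanged.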