[REBOL] Re: UTF-8 revisited
From: rotenca:telvia:it at: 27-Nov-2002 16:13
Hi Jan,
perhaps this can help:
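; udata/3 lists the UTF-8 lead-byte markers for sequences of 1 to 6 bytes.
udata: [1 2 [0 192 224 240 248 252]]
; Decoders indexed by character width k; slot 3 is an unused placeholder.
table: reduce [
    func[u][first u]
    func[u][either 0 < first u [0 + (second u) + (256 * first u)][second u]]
    3
    func[u][to integer! to binary! u]
]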
encode: func [
    {
    Encode a string of k-wide characters into a UTF-8 string,
    where k is 1, 2 or 4.
    The case k = 1 could have been isolated for much
    improved speed.
    (integer -> string -> string)
    }
    k [integer!]
    ucs [string!]
    /local c f x m result [string!]
][
    result: make string! length? ucs
    f: pick table k
    parse/all ucs [
        any [
            c: k skip (
                either 128 > x: f c [insert tail result x][
                    either x < 256 [
                        insert insert tail result x / 64 or 192 x and 63 or 128
                    ][
                        result: tail result
                        m: 1
                        ; Emit continuation bytes until the remaining bits fit
                        ; into the lead byte of an m-byte sequence.
                        while [x >= pick [128 32 16 8 4 2] m][
                            insert result to char! x and 63 or 128
                            x: x and -64 / 64
                            m: m + 1
                        ]
                        insert result to char! x or udata/3/:m
                    ]
                ]
            )
        ]
    ]
    head result
]
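
To cross-check the byte sequences, here is a minimal sketch of the same back-to-front idea in Python rather than REBOL. It is not the code above: the names `units`, `encode_code_point`, `LEAD` and `LIMIT` are made up for the illustration. It assumes big-endian k-byte units, and it emits continuation bytes until what remains of the value fits in the lead byte.

```python
# Lead-byte markers for sequences of 1..6 bytes (same values as udata/3).
LEAD = [0x00, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC]
# Smallest value that no longer fits in the lead byte of an m-byte sequence.
LIMIT = [0x80, 0x20, 0x10, 0x08, 0x04, 0x02]


def units(data: bytes, k: int):
    """Yield big-endian k-byte code units from data (k is 1, 2 or 4)."""
    if k not in (1, 2, 4):
        raise ValueError("k must be 1, 2 or 4")
    if len(data) % k:
        raise ValueError("input length is not a multiple of k")
    for i in range(0, len(data), k):
        yield int.from_bytes(data[i:i + k], "big")


def encode_code_point(x: int) -> bytes:
    """Encode one value in the range 0 .. 2**31 - 1 as UTF-8 bytes."""
    if not 0 <= x < 2**31:
        raise ValueError("value out of range")
    tail = []  # continuation bytes, collected lowest 6 bits first
    m = 1      # number of bytes planned so far
    while x >= LIMIT[m - 1]:
        tail.append(0x80 | (x & 0x3F))
        x >>= 6
        m += 1
    # The lead byte carries the marker plus the remaining high bits.
    return bytes([LEAD[m - 1] | x] + tail[::-1])


def encode(k: int, data: bytes) -> bytes:
    """Encode a buffer of k-byte-wide characters as UTF-8."""
    return b"".join(encode_code_point(x) for x in units(data, k))


print(encode(2, b"\x20\xac"))  # b'\xe2\x82\xac' (the euro sign, U+20AC)
```

Values such as U+0800 and U+10000 are worth testing in any implementation, because their high bits only just overflow the lead byte of the shorter form.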
---
Ciao
Romano