
[REBOL] Re: Hungarian Alphabet Sort (was Re: Collation sequence - proper and eff

From: geza67:freestart:hu at: 13-May-2002 20:29

Hello Scott!
>> The right order for Hungarian vowels: actually the diaeresis characters
> This was easy to fix.
... as you had already pointed out in your first post :-)
> Time to go back to the drawing board. I already have an idea, but it may
> take a while before I have some time to create the new algorithm.
Good luck with "braining out" the new, enhanced algorithm. :-)
> There end up being two issues at work here. Having the order as
> aáAÁ...eéEÉ...
> was not my intention. What I was aiming to do was
> aá..eé..AÁ..EÉ...
> which may also not seem correct to you; however, this behavior mirrors
Ah, so! No, this is quite right: small letters first, then capitals. I just thought you were aiming at an "interwoven" collation sequence.
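To make the two orderings concrete, here is a minimal sketch in Python (not REBOL, and not Scott's algorithm). It covers only the Hungarian vowels, and the names VOWEL_GROUPS, SMALLS_FIRST, INTERWOVEN and sort_chars are made up for this illustration.

```python
# Hungarian vowels, grouped by base letter (simplified: no consonants, no digraphs).
VOWEL_GROUPS = ["aá", "eé", "ií", "oóöő", "uúüű"]

LOWER = "".join(VOWEL_GROUPS)
UPPER = LOWER.upper()

# "aá..eé..AÁ..EÉ...": all small letters first, then all capitals.
SMALLS_FIRST = LOWER + UPPER

# "aáAÁ...eéEÉ...": small and capital forms interleaved per base letter.
INTERWOVEN = "".join(group + group.upper() for group in VOWEL_GROUPS)


def sort_chars(chars, order):
    """Sort characters by their position in the collation string `order`.

    Assumption: every character occurs in `order`; otherwise str.index
    raises ValueError.
    """
    return sorted(chars, key=order.index)


sample = "ÉAáEaÁée"
print("".join(sort_chars(sample, SMALLS_FIRST)))  # aáeéAÁEÉ
print("".join(sort_chars(sample, INTERWOVEN)))    # aáAÁeéEÉ
print("".join(sorted(sample)))                    # AEaeÁÉáé (plain code point order)
```

The last line shows what a sort by raw character code does: capitals come before smalls, and the accented letters land at the end.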
> REBOL's default behavior for the /case switch, but does differ in placing
REBOL seems (more and more, to me) English-oriented, which is very peculiar, Carl being a German fellow (do I have that right?). Has he forgotten the handling of his native language's special characters, like the German a-umlaut? ;-)
> the little letters before the capital letters. Petr K. said that this was
> the more normal method in eastern europe (Czech language in his case). So I
It is the normal method in Hungarian, as well.
> letter is capital or not. In fact, REBOL places all the words that begin in
> capital letters _before_ the words that begin in small letters (because of
> the ascii number assigned to the letters).
The problem is, IMHO, that REBOL does not allow _really_ custom sorts: one can write a comparison function for the /compare refinement, but this refinement is not as general-purpose as it first seems. Maybe mathematicians can use custom comparisons for, e.g., complex numbers, but the refinement cannot easily be adapted to custom-ordered series values, as is the case with strings. Specifying a collation order for strings is the first step toward internationalization. Europe being a huge and linguistically non-homogeneous market, RT should adopt a "plugin"-style localization: the 'locale object seems to be the right place for this, i.e. for putting custom collation sequences there.
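As a rough illustration of the "plugin" idea, here is a minimal sketch in Python rather than REBOL. Nothing in it is an existing REBOL or RT facility: the COLLATIONS table, collation_key and sort_words are hypothetical names, and the "hu" alphabet is simplified (single letters only, so digraphs such as "cs" or "sz" are not treated as letters of their own).

```python
# Hypothetical registry: locale name -> collation alphabet (small letters only).
COLLATIONS = {
    "hu": "aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz",
}


def collation_key(word, locale, smalls_first=True):
    """Build a sort key for `word` from the locale's collation alphabet."""
    lower = COLLATIONS[locale]
    upper = lower.upper()
    order = lower + upper if smalls_first else upper + lower
    rank = {ch: i for i, ch in enumerate(order)}
    # Characters missing from the table sort after all known ones, by code point.
    return [(0, rank[ch]) if ch in rank else (1, ord(ch)) for ch in word]


def sort_words(words, locale, smalls_first=True):
    """Sort words using the collation sequence registered for `locale`."""
    return sorted(words, key=lambda w: collation_key(w, locale, smalls_first))


words = ["zab", "ág", "Eb", "eb", "ab", "Ág"]
print(sort_words(words, "hu"))  # ['ab', 'ág', 'eb', 'zab', 'Ág', 'Eb']
print(sorted(words))            # ['Eb', 'ab', 'eb', 'zab', 'Ág', 'ág']
```

Adding another language would then only mean adding one more entry to the table, which is the kind of thing a locale object could carry.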
> Maybe we need an additional switch that allows for the eastern european
> desire to have smalls before capitals, and to interleave these together as
Maybe I missed this in English class :-) but is English NOT sorted this way, too? What is the proper sorting order for English words with mixed capitalization?
> you suggest. Sometimes it would be handy to have these options too here in
On what occasions do you think you would need it (disregarding the special case of writing custom software for Eastern Europe ;-) )?
> the US. Just need a clever name or names for these switches (or paths in
> REBOLese). Any ideas are welcomed.
The most obvious (and highly uninspired ;-) ) name would be /international. Other ideas: /smallsfirst, /capitalized
>> - and so on for all affected special accented chars.
> and so on for life in general!
> :-)
Do not stop generalizing here: Life, the Universe and Everything ... :-))
> I'll repost after I have a chance to develop the new algorithm that I have
> in mind. "Stay tuned"
Beep-beep :-)
--
Best regards,
Geza mailto:[geza67--freestart--hu]