Mailing List Archive: 49091 messages

[REBOL] Re: Hungarian Alphabet Sort (was Re: Collation sequence - proper and eff

From: nitsch-lists:netcologne at: 15-May-2002 18:58

Hi Carl, Scott, Gesa,

On Wednesday, 15 May 2002 14:30, Carl Read wrote:
> On 15-May-02, G. Scott Jones wrote:
> > From: "Volker Nitsch"
> > ...
> >
> >> not sure if this helps, but since i spended some time to it,
> >> i post ;)
> >
> > <snipped code>
> >
> > Hi, Volker,
> >
> > Neat idea. Kind of like a good cut of beef, I'm going to have to
> > chew on it a bit to fully understand its potential. Thanks for the
> > trans-atlantic volley ball pass.
>
> Glad you could work it out, as I couldn't make head nor tail of it. (:
>
Carl, my volley pass works much like yours, except that I made it more complicated :)

Your pattern-rule "aAbBcC" would look like

    [+ "a" + "A" + "b" + "B" + "c" + "C"]

Because I use blocks of strings, I can also map multi-char codes, so [+ "CH"] maps to one char. IIRC "CH" is handled like one char? I also have [+ "CH" = "Ch"]. This says "CH" and "Ch" are the same (add a new code number for "CH" and use the same number for "Ch"). And in my telephone book "ö" is handled like "oe", so one char expands to two. So I need this kind of command?

(Scott, I found no "ss" in this book, because "ss" always has two chars before it and is rarely used. Sorry..
http://www.uni-koeln.de/phil-fak/spinfo/lehre/java/kap23/collating.htm#3.3
Duden 73: "ß" like "ss", but before it when the words are otherwise the same (argh!). Changed since 96: "ß" comes after "ss". And "ä" is the same as "a". Is the telephone book wrong? Or Duden? Hmm.. back to the script.)

First the block is initialized with the ASCII codes [.. + "@" + "A" ..]. Then I could move whole char-blocks around, to mix "aAbB". Then comes customize-ascii:

    [
        at "h" [+ "ch"]
        at "H" [+ "CH" = "Ch"]
    ]

which says {find "h" in the block and insert [+ "ch"] behind it}, and the same for "H". Now I have [.. + "h" + "ch" + "i"].

In a second pass I give numbers to the strings in this order. In a third pass I create the parse rule, which translates a string to the sort encoding. For sorting I mix strings and their translations like [translation1 string1 translation2 string2], sort with sort/skip 2, and extract the strings back to the original block.

Hmm, somehow I like your string more, if only it could deal with multi-chars.
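For readers who want to try the three passes described above, here is a rough sketch. It is written in Python rather than REBOL, and it is not the snipped script: the names insert_after, build_codes, encode and collate are made up for illustration, the expansion table ("ö" to "oe") is passed in as a plain dict, and multi-char entries are matched longest first, which is an assumption about how the parse rule would be ordered.

```python
import string


def insert_after(table, anchor, group):
    """Customize step: put a new group right behind the group holding `anchor`."""
    for index, existing in enumerate(table):
        if anchor in existing:
            table.insert(index + 1, list(group))
            return
    raise ValueError("anchor not in table: %r" % anchor)


def build_codes(table):
    """Second pass: number the groups in order; equivalent strings share a number."""
    codes = {}
    for number, group in enumerate(table, start=1):
        for text in group:
            codes[text] = number
    return codes


def encode(word, codes, expansions):
    """Third pass: translate a word into its list of sort codes."""
    for source, target in expansions.items():
        word = word.replace(source, target)      # e.g. one char expands to two
    entries = sorted(codes, key=len, reverse=True)  # assumption: longest match first
    unknown = max(codes.values(), default=0) + 1    # anything not in the table
    result, pos = [], 0
    while pos < len(word):
        for entry in entries:
            if word.startswith(entry, pos):
                result.append(codes[entry])
                pos += len(entry)
                break
        else:
            result.append(unknown)
            pos += 1
    return result


def collate(words, table, expansions=None):
    """Pair each word with its translation, sort on the translation, unpack."""
    codes = build_codes(table)
    pairs = [(encode(word, codes, expansions or {}), word) for word in words]
    pairs.sort(key=lambda pair: pair[0])
    return [word for _, word in pairs]


# Start from plain ASCII letter order, then customize as in the message.
table = [[char] for char in string.ascii_uppercase + string.ascii_lowercase]
insert_after(table, "h", ["ch"])
insert_after(table, "H", ["CH", "Ch"])
codes = build_codes(table)

print(collate(["hut", "chat", "cat", "ice"], table))
```

With this table "ch" gets its own code between "h" and "i", so "chat" sorts after "hut", and "CHat" and "Chat" translate to the same code list.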
> Anyway, I've played around with my idea for sorting according to a
> pattern, and while I'm not sure if the following code's very fast (or
> bug-free:), like Volker, I post.
>
Good idea :-)
> There's two functions: One to take a pattern for creating a rule from
> and another to use the rule to sort strings or blocks of strings
> with. First, the functions...
>
> pattern-rule: func [
>     "Create a rule for use by pattern-sort."
>     pattern [string! block!] "An ordered pattern."
>     /local rule n
> ][
>     rule: copy []
>     n: 1
>     forall pattern [
>         append rule reduce [pattern/1 to-paren reduce ['r n] '|]
>         n: n + 1
>     ]
>     append rule reduce ['skip to-paren reduce ['r n]]
>     reduce ['some rule]
> ]
>
> pattern-sort: func [
>     {Sort a string or block of strings based on a rule created
>     by pattern-rule.}
>     series [string! block!] "Series to sort."
>     rule [block!] "Pattern rule."
>     /reverse "Reverse sort order."
>     /local ptrs blk r pos val
> ][
>     ptrs: copy []
>     blk: copy []
>     r: func [n][append/only blk n]
>     bind rule 'r
>     either string? series [
>         parse/case series rule
>         pos: 1
>         foreach n blk [
>             append/only ptrs reduce [
>                 n pick rule/2 (n - 1) * 3 + 1
>             ]
>             val: next first back tail ptrs
>             if 'skip = val/1 [change val pick series pos]
>             pos: pos + either char? val/1 [1][length? val/1]
>         ]
>     ][
>         forall series [
>             clear blk
>             parse/case series/1 rule
>             append/only ptrs copy blk
>             append last ptrs series/1
>         ]
>     ]
>     either reverse [sort/reverse ptrs][sort ptrs]
>     clear series
>     forall ptrs [append series last ptrs/1]
>     series
> ]
>
> And some examples of use...
>
> >> rule-1: pattern-rule "aAbBcC"
> == [some [#"a" (r 1) | #"A" (r 2) | #"b" (r 3) | #"B" (r 4) | #"c" (r 5) | #"C" (r 6) | skip (r 7)]]
>
> >> pattern-sort "AacCBb" rule-1
> == "aAbBcC"
>
> >> pattern-sort ["Abc" "abc" "aBC" "ABC"] rule-1
> == ["abc" "aBC" "Abc" "ABC"]
>
> >> pattern-sort/reverse ["Abc" "abc" "aBC" "ABC"] rule-1
> == ["ABC" "Abc" "aBC" "abc"]
>
> >> rule-2: pattern-rule "AaBbCc"
> == [some [#"A" (r 1) | #"a" (r 2) | #"B" (r 3) | #"b" (r 4) | #"C" (r 5) | #"c" (r 6) | skip (r 7)]]
>
> >> pattern-sort "AacCBb" rule-2
> == "AaBbCc"
>
> >> pattern-sort ["Abc" "abc" "aBC" "ABC"] rule-2
> == ["ABC" "Abc" "aBC" "abc"]
>
> >> rule-3: pattern-rule ["a" "A" "b" "B" "ch" "c" "C"]
> == [some ["a" (r 1) | "A" (r 2) | "b" (r 3) | "B" (r 4) | "ch" (r 5) | "c" (r 6) | "C" (r 7) | skip (r 8)]]
>
> >> pattern-sort "abcABCchCbA" rule-3
> == "aAAbbBchcCC"
>
> >> pattern-sort ["AabA" "chab" "chAB" "cchc" "achA"] rule-3
> == ["achA" "AabA" "chab" "chAB" "cchc"]
>
> It seems to work and might be of some use, but I'd test it well before
> trusting it. It's had no real-world tests at all...
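The idea behind the quoted pattern-sort can also be sketched outside REBOL. The following is a loose Python rendering, not a port: the function names tokenize and pattern_sort are chosen here for illustration, pattern entries are tried in the order given (as the alternatives in the generated rule are, so "ch" has to come before "c"), and for blocks the original string is used only as a tie-breaker after the rank list, which is an assumption about the intended ordering.

```python
def tokenize(text, pattern):
    """Split text into (rank, token) pairs, trying pattern entries in order."""
    unknown = len(pattern) + 1          # rank for anything outside the pattern
    pairs, pos = [], 0
    while pos < len(text):
        for rank, entry in enumerate(pattern, start=1):
            if entry and text.startswith(entry, pos):
                pairs.append((rank, entry))
                pos += len(entry)
                break
        else:
            pairs.append((unknown, text[pos]))
            pos += 1
    return pairs


def pattern_sort(series, pattern, reverse=False):
    """Sort the tokens of a string, or a list of strings, by pattern rank."""
    if isinstance(series, str):
        pairs = sorted(tokenize(series, pattern), reverse=reverse)
        return "".join(token for _, token in pairs)

    def key(item):
        return ([rank for rank, _ in tokenize(item, pattern)], item)

    return sorted(series, key=key, reverse=reverse)


rule_3 = ["a", "A", "b", "B", "ch", "c", "C"]
print(pattern_sort("abcABCchCbA", rule_3))
print(pattern_sort(["AabA", "chab", "chAB", "cchc", "achA"], rule_3))
```

Run against the inputs from the quoted session, this gives the same orderings as shown there. Unlike the REBOL version it returns a new value instead of changing the series in place.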
Greetings,
Volker