Mailing List Archive: 49091 messages
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search

[REBOL] Re: Bitset!

From: greggirwin:mindspring at: 19-Jun-2004 6:40

Hi Patrick, PPln> it possible to know what characters are encoded in a bitset? Below is an older script by Brett Handley, which may work for you as is, or as a starting point to customize. -- Gregg ; WATCH FOR WRAP! REBOL [ Title: "Bitset tools" Date: 17-Aug-2000 File: %bitsets.r Author: "Brett Handley" Email: [brett--codeconscious--com] Purpose: { Define functions that relate to bitsets. } ] describe-bitset: function [ "Describes a bitset and returns a form suitable for charset." bitset [bitset!] /string "Return result as a string not a block. Implies /norange." /norange "Returns result without using ranges for block results." /longest "Returns the longest ranges instead of -nice- ranges." ; Instead of having multiple functions, this function uses recursion to ; implement the different functionalities. ][ result test-char range-start range-end range-spec nice-ranges tmp-charset ][ ; Norange is ignored if result should be a string. if string [ norange: true ] ; Longest is ignored if we are not going to collect into ranges. if norange [ longest: false ] ; ; Return all the characters that are included. ; if norange [ either string [ result: copy {} ][ result: copy [] ] for i 0 255 1 [ if parse/all to-string test-char: to-char i reduce [ bitset ] [ append result test-char ] ] RETURN result ] ; ; Collect characters into the longest ranges we can. ; ; Outer loop interates over each possible range. ; Inner loop test each character for inclusion in the range. if longest [ ; First get all the characters involved. result: head describe-bitset/norange bitset range-start: result while [not tail? range-start][ range-end: range-start until [ range-end: next range-end either any [ tail? range-end not equal? (offset? range-start range-end) (subtract first range-end first range-start) ] [true] [false] ] ; Test is it is a worthwhile range. if all [ greater? offset? range-start range-end 1 greater? subtract last back range-end first range-start 2 ] [ ; Replace individual characters with a range. range-spec: reduce [first range-start to-word "-" first back range-end] remove/part range-start range-end range-end: insert range-start range-spec ] range-start: range-end ] RETURN head result ] ; ; Segment the given bitset into "nice" ranges. ; result: make block! 1 nice-ranges: [ #"0" #"9" #"A" #"Z" #"a" #"z" ] ; Get our "nice" ranges first. foreach [rs re] nice-ranges [ append result describe-bitset/longest intersect bitset charset compose [(rs) - (re)] ] ; Get everything and dump it at the end. tmp-charset: charset {} foreach [rs re] nice-ranges [ tmp-charset: union tmp-charset (charset compose [(rs) - (re)]) ] append result describe-bitset/longest intersect bitset complement tmp-charset RETURN result ] describe-object-bitsets: function [ "Reports the character sets used by an object." obj [object!] ] [result field] [ result: make block! 1 foreach w first :obj [ if equal? type? field: get in obj w bitset! [ append result w append/only result describe-bitset field ] ] result ] ; Examples ; Three different ways to describe the digit charset of xml-language. ; describe-bitset xml-language/digit ; describe-bitset/norange xml-language/digit ; describe-bitset/string xml-language/digit ; ; Lets highlight the continuous ranges of the alpha-num charset of xml-language. ; describe-bitset/longest xml-language/alpha-num ; ; How many characters does the digit bitset represent? ; length? describe-bitset/norange xml-language/digit ; ; What is the difference between these two fields of xml-language? ; describe-bitset difference xml-language/data-chars xml-language/data-chars-qt2 ; ; Show me all the bitset fields of xml-language. ; foreach [bn bs] describe-object-bitsets xml-language [print [bn mold bs]] ; ; Sanity test for my function. ; equal? ( charset describe-bitset complement charset {} ) ( complement charset {} )