bitset help
[1/17] from: rotenca::telvia::it at: 26-Sep-2002 16:42
I do not understand how can be used all this new stuff from the new betas. Can
someone make an useful example to me? I knows only how to use bitset in parse
rule for char, but this seems something different.
These are the RT docs:
--------------------
2.3. New Bitset Functions: CLEAR, LENGTH?, EMPTY?
Three new functions (actions) have been added to bitsets:
The CLEAR function quickly clears all bits to zero.
The LENGTH? function returns the number of bits in the bitset (always multiple
of 8).
The EMPTY? function returns TRUE if any bit is set, otherwise it returns
FALSE. (Note that EMPTY? is the same function as TAIL?; therefore, TAIL? also
returns the same results, but the word has no meaning for bitsets.)
Bitsets are used as high performance logic arrays for character sets and
hashes.
Examples:
>> items: make bitset! 40
== make bitset! #{0000000000}
>> length? items
== 40
--------------------
---
Ciao
Romano
[2/17] from: greggirwin:mindspring at: 26-Sep-2002 12:26
Hi Romano,
<< I do not understand how can be used all this new stuff from the new
betas. Can
someone make an useful example to me? I knows only how to use bitset in
parse
rule for char, but this seems something different. >>
Bitsets can be very handy if you have a really large number of boolean flags
to keep track of and want to use minimal space and processing power (in
general). The idea, of course, is that each flag only takes one bit, which
means you can store boatloads of them in little memory when compared to,
say, a normal boolean/logic! value that probably uses 32 bits (or more in
the future :).
Here is something I just hacked together very quickly (really, truly, very
little testing, probably lots of problems, you have been warned, watch for
wrap, requires latest betas...I think, blah blah blah).
--Gregg
REBOL [
Title: "Bit Tools"
Author: "Gregg Irwin"
EMail: [greggirwin--acm--org]
File: %bit-tools.r
]
bit: make object! [
; Quick hack! No real thought about naming issues. May be jumping
; through a lot of unnecessary hoops, but some things that I
; thought should work didn't. :\
; Could/should CLEAR change bitset in place?
; Should bit indexes be treated as 0 based or 1 based?
; Note XOR naming issue in comment.
; Add TOGGLE function to flip bits?
; Some functions could easily take blocks to check the status of
; multiple bits (e.g. SET?).
; This is a special version just for this purpose, and always rounds up.
; So, why pass in the interval parameter? Good question. :)
round-to-interval: func [value interval /local whole-val diff] [
whole-val: to integer! value / interval
diff: either 0 = remainder value interval [0][1]
whole-val + diff * interval
]
bitset-from-bits: func [
bits [integer! block!] "Bit indices."
/local result
][
bits: compose [(bits)]
result: make bitset! round-to-interval max 1 first maximum-of bits 8
foreach bit bits [insert result bit]
]
and: func [set1 [bitset!] set2 [bitset!]] [
intersect set1 set2
]
or: func [set1 [bitset!] set2 [bitset!]] [
union set1 set2
]
; Just using to binary! on bitsets doesn't work. You get junk
; at the end if the bitset is short than 256 bits.
; This is called BXOR because calling it XOR and then referencing
; the standard XOR function as SYSTEM/WORDS/XOR fails. Must be
; missing something obvious, but I'm not seeing it. Need more coffee.
; There has *got* to be a better way than this!
bxor: func [set1 [bitset!] set2 [bitset!] /local b1 b2] [
(head system/words/clear at to binary! set1 add 1 divide length?
set1 8)
xor
(head system/words/clear at to binary! set2 add 1 divide length?
set2 8)
]
not: func [bitset [bitset!]] [
complement bitset
]
get: func [
bitset [bitset!]
index [integer!] "Which bit (0 to (length? value) - 1)"
][
either empty? and bitset bitset-from-bits index [0][1]
]
set: func [
bitset [bitset!]
index [integer!] "Which bit (0 to (length? value) - 1)"
][
insert bitset index
]
set?: func [
bitset [bitset!]
index [integer!] "Which bit (0 to (length? value) - 1)"
][
either empty? and bitset bitset-from-bits index [false][true]
]
clear: func [
bitset [bitset!]
index [integer!] "Which bit (0 to (length? value) - 1)"
][
and bitset complement bitset-from-bits index
]
]
; Paste this into the console to see the results in the conext of calls.
print items: make bitset! 40
print bit/set items 0
print bit/set items 39
print bit/set? items 0
print bit/set? items 1
print bit/set? items 38
print bit/set? items 39
print bit/get items 0
print bit/get items 1
print bit/get items 38
print bit/get items 39
print items2: complement items
print bit/and items items2
print bit/or items items2
print bit/bxor items items2
print items: bit/clear items 0
print items: bit/clear items 39
halt
[3/17] from: brett:codeconscious at: 28-Sep-2002 0:24
Hi Romano,
> I do not understand how can be used all this new stuff from the new betas.
Can
> someone make an useful example to me? I knows only how to use bitset in
parse
> rule for char, but this seems something different.
A number inserted into a bitset is turns on a bit in the position indicated
by that number. Starting with zero. So if you have a set of things and to
each thing you assign an integer, you could encode the presence of each
thing in a bitset
people: [{Romano} {Anton} {Allen} {Gabriele} {Petr}]
0 will represent Romano
1 will represent Anton
etc.
Now I can use a bitset to represent set membership. My first set will be
people that have Italian email addresses:
italian-email-address: make bitset! 8
insert italian-email-address 0
insert italian-email-address 3
I can query the set - does Petr have an Italian email address?
>> find italian-email-address 4
== none
No is the answer.
I can create another set this time of people that have a name in which the
first letter falls into the A-K range.
a-k-names: make bitset! 8
repeat n [1 2 3] [insert a-k-names n]
Now if I can calculate the set of people that belong to both sets.
resulting-set: intersect italian-email-address a-k-names
But there does not seem to a function to enumerate a bitset - so I do it
myself and find that Gabriele is a member of both sets:
>> repeat i subtract length? resulting-set 1 [if find resulting-set i
[print i]]
3
== none
Bitset lengths are always a multiple of 8. If you have 18 possible values
you will need a bitset of length 24
make bitset! 24
length? returns the capacity of the bitset
>> length? make bitset! 24
== 24
empty? will show you if any bits are set:
>> empty? make bitset! 2000
== true
Clear resets all the bits:
>> clear probe insert make bitset! 8 5
make bitset! #{20}
== make bitset! #{00}
On my machine and with this version a bitset seems to have a maximum size of
2040.
Here's a little function to represent the bits of a bitset:
stringbits: func [bitset [bitset!]][
enbase/base head reverse to-string load find/tail/last form bitset
{ } 2
]
This nicely shows how each number is mapped to a bit position:
>> stringbits insert make bitset! 8 0
== "00000001"
>> stringbits insert make bitset! 8 1
== "00000010"
>> stringbits insert make bitset! 8 2
== "00000100"
>> stringbits insert make bitset! 8 3
== "00001000"
>> stringbits insert insert make bitset! 8 0 3
== "00001001"
Don't know if that was what you wanted.
I found this article when looking at what bitsets can be used for. Quite
interesting I thought:
http://www.flipcode.com/tutorials/tut_bloomfilter.shtml
Regards,
Brett.
[4/17] from: rotenca:telvia:it at: 27-Sep-2002 18:07
Thanks Gregg and Brett,
it is exactly what i wanted to understand.
I ask myself if you learned all from that RT doc, i did not understand
anything from it!
> On my machine and with this version a bitset seems to have a maximum size of
> 2040.
On my system:
> make bitset! 1024 * 1024
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
>> lenght? make bitset! 1024 * 1024
== 1048576
>> about
REBOL/View 1.2.8.3.1 3-Aug-2002
>Here's a little function to represent the bits of a bitset:
> stringbits: func [bitset [bitset!]][
> enbase/base head reverse to-string load find/tail/last form bitset { }
2
> ]
Very useful. Cannot be removed that to-string?
That reverse will not fail on different system? (big-endian/little endian)
---
Ciao
Romano
[5/17] from: rotenca:telvia:it at: 27-Sep-2002 18:48
Hi Gregg,
> ; There has *got* to be a better way than this!
> bxor: func [set1 [bitset!] set2 [bitset!] /local b1 b2] [
<<quoted lines omitted: 4>>
> set2 8)
> ]
Can be?
exclude union set1 set2 intersect set1 set2
---
Ciao
Romano
[6/17] from: greggirwin:mindspring at: 27-Sep-2002 10:53
hi Romano,
<< I ask myself if you learned all from that RT doc, i did not understand
anything from it! >>
Well, I know how I *wanted* them to work, but some things don't quite. :) I
played around with things quickly to figure out what I did. I got
unsubscribed briefly so I just read Brett's note on eScribe and updated my
stuff to use FIND for the GEt and SET? functions since I didn't know that
worked. If XOR and REMOVE worked as well, we'd be all set. :)
--Gregg
[7/17] from: rotenca:telvia:it at: 27-Sep-2002 19:18
> Hi Gregg,
> > ; There has *got* to be a better way than this!
<<quoted lines omitted: 7>>
> Can be?
> exclude union set1 set2 intersect set1 set2
Better?
difference set1 set2
---
Ciao
Romano
[8/17] from: greggirwin:mindspring at: 27-Sep-2002 11:20
<<
Can be?
exclude union set1 set2 intersect set1 set2
>>
I think so! That's what I get for focusing on trying to make the native XOR
work. :)
Thanks!
--Gregg
[9/17] from: greggirwin:mindspring at: 27-Sep-2002 13:11
<<
Better?
difference set1 set2
>>
DOH! :) OK, so this brings up a question. Does anyone have something that
will list all the functions that support given datatypes as parameters?
I can't blame my ignorance on not having such a tool though. I had
DIFFERENCE listed as a function that supports bitsets. I just wasn't
thinking, obviously :\
Thanks Romano!
--Gregg
[10/17] from: greggirwin:mindspring at: 27-Sep-2002 13:32
This all brings up a really good point for me. Sometimes I'm so stuck on
certain things I can't see the obvious. How many of us have suffered with
AND, OR, and XOR in our code, stopping to think so hard about whether we had
the right operator in there, particularly when you have to distinguish
between bitwise and logical operations!
INTERSECT, UNION, and DIFFERENCE are so much more human friendly! Just
looking for quick definitions I find:
Intersect - To have points in common.
Union - The set, each of whose elements lies in at least one
member of the collection.
Difference (symmetric) - The set of elements that belong to one
but not both of two given sets; the union of their relative
complements; the relative complement of their intersection
in their union.
Carl S. probably just shakes his head sometimes when he sees how hard some
of us (OK, me) make things when he has made them so easy. Hopefully he'll be
patient as we unlearn things. :)
Thanks Carl!
--Gregg
[11/17] from: rotenca:telvia:it at: 27-Sep-2002 22:21
In 1.2.8 accept bitset!:
system/words/negate
system/words/complement
system/words/tail?
system/words/length?
system/words/find
system/words/copy
system/words/insert
system/words/remove
system/words/clear
system/words/exclude
system/words/difference
system/words/intersect
system/words/union
system/words/empty?
system/words/copy
It seems that remove part can delete bit, the integer arg is the bit number:
>> remove/part probe insert make bitset! 8 0 0
make bitset! #{01}
== make bitset! #{00}
---
Ciao
Romano
[12/17] from: greggirwin:mindspring at: 27-Sep-2002 15:53
<< It seems that remove part can delete bit, the integer arg is the bit
number: >>
Excellent! That seems to work here too!
Thanks!
--Gregg
[13/17] from: brett:codeconscious at: 28-Sep-2002 13:01
Hi Gregg, Romano,
> I ask myself if you learned all from that RT doc, i did not understand
> anything from it!
I'd heard that some DBs nowadays incorporate bit vector indexes - not sure
what the term is and after RT made those notes I figured it was time to find
out what bitset! could be used for. So I searched the internet for examples
and experimented with various series functions. So no, I didn't get it from
the docs :^)
> > On my machine and with this version a bitset seems to have a maximum
size of
> > 2040.
I need to correct this - after seeing your post.
I'm not sure what the maximum is now - could be as much as OS is willing to
provide perhaps.
For example this works, but I can here my disk starting to work heavily.
r: make bitset! 1024 * 1024 * 768
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
But 1024 * 1024 * 1024 causes a crash.
My comment about multiples of 8 could be wrong too, not sure what the actual
constraint is.
> >Here's a little function to represent the bits of a bitset:
> > stringbits: func [bitset [bitset!]][
> > enbase/base head reverse to-string load find/tail/last form bitset
{ }
> 2
> > ]
>
> Very useful. Cannot be removed that to-string?
Probably, it evolved :^)
> That reverse will not fail on different system? (big-endian/little endian)
Another good point. I'm using Win NT 4.0 on Intel. What happens on other
systems?
Thanks for the info on REMOVE as well.
Regards,
Brett.
[14/17] from: rotenca:telvia:it at: 28-Sep-2002 20:27
Hi, Brett
> > That reverse will not fail on different system? (big-endian/little endian)
>
> Another good point. I'm using Win NT 4.0 on Intel. What happens on other
> systems?
No one with Rebol running on Amiga?
---
Ciao
Romano
[15/17] from: carl:cybercraft at: 29-Sep-2002 9:52
On 29-Sep-02, Romano Paolo Tenca wrote:
> Hi, Brett
>> > That reverse will not fail on different system?
<<quoted lines omitted: 3>>
>> systems?
> No one with Rebol running on Amiga?
Thought I could safely avoid this thread, but it seems not. (; On a
32Meg Amiga...
>> r: make bitset! 1024 * 1024 * 768
** Script Error: Not enough memory
** Near: r: make bitset! 1024 *
>> r: make bitset! 1024 * 1024
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
>> r: make bitset! 1024 * 1024 * 256
** Script Error: Not enough memory
** Near: r: make bitset! 1024 *
>> r: make bitset! 1024 * 1024 * 2
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
>> r: make bitset! 1024 * 1024 * 4
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
>> r: make bitset! 1024 * 1024 * 8
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
>> r: make bitset! 1024 * 1024 * 16
== make bitset! #{
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000...
>> r: make bitset! 1024 * 1024 * 32
== ** Script Error: Not enough memory
** Near: ** CRASH (Should not happen) - Corrupt datatype: 120
View failed returncode 100
Happy now? (:
--
Carl Read
[16/17] from: rotenca:telvia:it at: 29-Sep-2002 23:42
Hi, Carl,
> >> > That reverse will not fail on different system?
> >> > (big-endian/little endian)
<<quoted lines omitted: 5>>
> 32Meg Amiga...
> >> r: make bitset! 1024 * 1024 * 768
Thanks Carl, but we want to know if the reverse fails on a different system.
This is the test and the result on Intel:
enbase/base head reverse load skip form insert make bitset! 16 15 13 2
== "1000000000000000"
We fear that on amiga it will be "0000000010000000"
Ciao
Romano
[17/17] from: carl:cybercraft at: 30-Sep-2002 11:40
On 30-Sep-02, Romano Paolo Tenca wrote:
> Hi, Carl,
>>>>> That reverse will not fail on different system?
<<quoted lines omitted: 11>>
> 2 == "1000000000000000"
> We fear that on amiga it will be "0000000010000000"
Ah. I looked back too far into the mails. (: No, it behaves as on
Intel...
>> enbase/base head reverse load skip form insert make bitset! 16 15
13 2
== "1000000000000000"
Is this consistantly good or consistantly bad? (;
--
Carl Read
Notes
- Quoted lines have been omitted from some messages.
View the message alone to see the lines that have been omitted