World: r3wp
[Core] Discuss core issues
older newer | first last |
Anton 20-May-2007 [8077] | (ie. how many unique integer values are needed ?) |
Terry 20-May-2007 [8078] | no negatives.. and a MAX of 32 bits is more than enough for the largest number |
Sunanda 20-May-2007 [8079] | A value slot is 16 bytes, so a single integer takes 16 bytes -- it has to live in a value slot. |
Terry 20-May-2007 [8080x2] | 16 bytes.. that seems large |
I guess that's where the number of words limitation kicks in? | |
Anton 20-May-2007 [8082] | No, the slot size is fixed so it can fit a whole lot of different types in it, eg. time! values. |
Sunanda 20-May-2007 [8083] | word limit is something else -- to do with the number of unique (across all contexts) words. Not related to value slots as far as I know. |
Terry 20-May-2007 [8084] | Im thinking I should do this in Assembly ;) |
Anton 20-May-2007 [8085] | Well, what is your largest number ? |
Terry 20-May-2007 [8086] | ahh .. ok.. that's a fairly heavy price to pay (relatively speaking) just so I can store non binary data types |
Anton 20-May-2007 [8087] | That dictates the number of bits needed to represent all your numbers. |
Sunanda 20-May-2007 [8088] | For compact representation of large integer sets, I often uses the format: [10x4 50 76x1000] ir REBOL pairs! (each taking 1 value slot) meaning [10 11 12 13 50 76 77 78 ..... 1076] This works for me for large blocks of semi-sparse integers. There is a script for handling this format at REBOL.org -- look for rse-ids.r |
Terry 20-May-2007 [8089] | Im guessing my largest number is probably around 2 million... approx |
Anton 20-May-2007 [8090] | You have to know what your largest number is. Otherwise your software will break with overflow errors. |
Terry 20-May-2007 [8091] | well.. say 21 bits |
Anton 20-May-2007 [8092x3] | Well 24bits (3 bytes) --> 2^24 = 16777216 unique values. |
Ok, so you could pack each 21 (or 24) bits into a binary. | |
No wastage. | |
Terry 20-May-2007 [8095] | so is that a block of binaries then? |
Anton 20-May-2007 [8096] | No, a single binary. |
Terry 20-May-2007 [8097] | yeah for each... but i need to group them |
Anton 20-May-2007 [8098x3] | You have to write the access methods to index into your binary though. |
(So for that I recommend just using 3 bytes.) | |
Ok, well you can have several binaries, why not ? | |
Sunanda 20-May-2007 [8101x2] | Anton, I think, is suggesting you do something like this: my-list: make binary! 25000 Then use insert or append to store 3-byte values to the binary string you have created. That way you need only 1 value slot plus the length of the binary string. |
oops -- anton typed faster than me :-) | |
Terry 20-May-2007 [8103] | ahh ok.. ie: [a4b 4°¢ ÑAg] etc ? |
Sunanda 20-May-2007 [8104] | That would be a block of binary....each binar would take a slot Anton is suggesting this sort of approach: x: make binary! 25000 >> loop 18 [insert x to-char random 255] == #{5E218289FC8B65B86A1C597C232F9E8C79} Just one binary string to hold the data. |
Terry 20-May-2007 [8105x2] | is there a function to convert an integer to 3 bytes? |
ahh.. nice | |
Anton 20-May-2007 [8107x4] | No, you must make the functions to convert 3-byte-int <--> integer! |
>> bin: #{000005000006000007} >> repeat index 3 [bin-int: copy/part skip bin (index - 1 * 3) 3 print [index mold bin-int to-integer bin-int]] 1 #{000005} 5 2 #{000006} 6 3 #{000007} 7 | |
binary! is indexed per byte, so if we use 3 bytes per value, we must multiply the index by 3 to find the correct value. | |
We must do this in our functions which should allow retrieving, changing and inserting the value. | |
Terry 20-May-2007 [8111] | would there be a performance trade off as opposed to just a block of integers? (finding/ appending etc.) |
Anton 20-May-2007 [8112x3] | Note that converting integer <-> binary is a little tricky. But we can use a struct! (re-use a shared struct for optimal performance). |
Yes, there is lost performance. Obviously, you must go via your access functions each time you store and retrieve. | |
(This is addressed by Rebol3's vector! datatype, which stores different values this way.) | |
Terry 20-May-2007 [8115x2] | well, performance is the primary concern |
i thought crawling a block of binary (hash?) would be faster than a block of integers | |
Anton 20-May-2007 [8117] | How many numbers are there likely to be ? What do they represent ? |
Terry 20-May-2007 [8118] | Here's what Im trying to do.. I have a dictionary... dict: ["one" "two" "three" ... ] with a couple of million values I want to build an index block .. dex: [ 2 3 2 1] that represents the index? of each word in my dictionary |
Anton 20-May-2007 [8119] | Is this to get around rebol's current word limit ? |
Terry 20-May-2007 [8120x5] | Only partly |
but there's a compression value as well.. | |
a value in the dictionary could be a page of rebol code (as a string) .. and it could be represented with a single bit (or in this case.. 3 bytes) | |
the values never change.. they may be deleted... and their index re-valued.. or the dictionary may be appended.. | |
so then my code becomes a block of 3 byte 'symbols' that I use to 'pick' the actual values out of the dictionary with. | |
Anton 20-May-2007 [8125x2] | Is the aim to create a fixed-width number representation for the index ? |
(I mean, a fixed-width of 3 characters.) | |
older newer | first last |