World: r3wp
[Core] Discuss core issues
Graham 12-Dec-2009 [15200] | My paypal account is ... |
Maxim 12-Dec-2009 [15201] | (hahaha was gonna say... "bills in the mail" ;-) |
Graham 12-Dec-2009 [15202] | too slow ... |
Von 12-Dec-2009 [15203] | Hey, I'd be happy to send some $, if it means speeding up my learning curve! I'll PayPal some money over, seriously! |
Graham 12-Dec-2009 [15204x2] | great .. can pay for my new USB LCD monitor :) [sales-:-compkarori-:-co-:-nz] :) |
.. help pay .. not pay for the whole thing! | |
Von 12-Dec-2009 [15206] | Done, I've submitted via PayPal :-) Thanks for your help, I can get some rest now :-) Thanks Maxim for your help also! |
Graham 12-Dec-2009 [15207] | Thanks Von .. will certainly encourage others to help you out :) |
Maxim 12-Dec-2009 [15208] | and who said we couldn't make a profit by reboling ;-) |
Von 13-Dec-2009 [15209] | Graham, you mentioned that I should encode the password. Is this in case someone hacks into my host? If I use the encloak function, couldn't someone also find my readable key in the script and then decloak my password using Rebol? |
Henrik 13-Dec-2009 [15210x2] | would there be instances where write/lines/append would write a quarter or half a line? I'm logging tests of several script instances into the same file and write/lines/append sometimes produces only half a line in the log. |
sometimes empty lines occur as well | |
sqlab 13-Dec-2009 [15212] | If you write with different rebol instances into the same file at the same time, you are out of luck. I |
Janko 13-Dec-2009 [15213x2] | could you create something like a trie in rebol or would you have to go lower level for it to be reasonably efficient? |
(let's say I want to use key-value (string-int) pairs for 5M words .. hash tables are probably more memory consuming for such a big set of data?) | |
Maxim 13-Dec-2009 [15215x4] | hash tables for such a big set are the only way to go... they will be orders of magnitude faster on access. |
I've had REBOL use up over 700MB of RAM without issues or dramatic speed drops... but with millions of items, make sure you pre-allocate your hash table, 'cause if you keep appending to the same table for each entry, it will get progressively slower as it grows. | |
but when things are in the millions, sometimes using a disk-based on-demand caching algorithm is fastest... it really depends on the application. think of it this way: every byte used by each element becomes a MB, so it adds up quickly. 5 million pairs of (10 byte) strings and pairs... is just about 350MB !
>> b: make block! 5000010
>> m: stats
== 84172417
>> loop 5000000 [append b copy random "1234567890" append b random 10000000]
== ["5862713409" 4765171 "2546013987" 2726704 "9528013746" 3565380 "4591302786" ...
>> stats - m
== 348435008 | |
(oops 'strings and pairs' > 'string and *integers*' ) | |
Janko 13-Dec-2009 [15219x3] | Maxim .. thanks a lot for your answers.. very interesting .. I know from a distance how hashtables work internally but I don't know the details.. should a block take roughly the same space as a hashtable with the same contents (in rebol), or is it different by some factor(s)? |
hm.. does stats return ram used?? | |
(it does, cool :) I was looking at processes to see how much it will eat) | |
Maxim 13-Dec-2009 [15222] | hum... let's see: ;-)
a: stats
b: make block! 5000010
print stats - a
== 80001039
a: stats
b: make hash! 5000010
print stats - a
== 80005071 |
Janko 13-Dec-2009 [15223x2] |
>> a: stats b: make block! 1000 repeat i 1000 [ append b random "abcdef" random 100000 ] print stats - a
48671
>> a: stats b: make hash! 1000 repeat i 1000 [ append b random "abcdef" random 100000 ] print stats - a
81454 |
:) | |
Maxim 13-Dec-2009 [15225] | but... filled up....
b: make hash! 5000010
m: stats
loop 5000000 [append b copy random "1234567890" append b random 10000000]
print stats - m
== 188430448
here it's half the space. a ha! depending on the string input... hash tables can actually be smaller... :-) |
Janko 13-Dec-2009 [15226] | stats is a cool command , with many refinements also .. I didn't know about it |
Maxim 13-Dec-2009 [15227] | in REBOL, we're a newbie a few minutes... every day.... even after a decade of using it ;-) |
Janko 13-Dec-2009 [15228] | I am a newbie a little longer each day :) |
Maxim 13-Dec-2009 [15229] | hehe |
Janko 13-Dec-2009 [15230x2] | aha, I see that it depends .. I increased the length of string and block increased in size while hash stayed the same |
hm.. I have a very newbie question .. do you most effectively add new pairs to hashtable by appending to it as a block ? can't figure out how to change a value .. set doesn't work that way | |
Maxim 13-Dec-2009 [15232x4] | yep. |
append works on hash tables. in fact they behave exactly the same as if you were using blocks, except that the internal representation is different from what you see through code. | |
a: make hash! [ "33" 33 "44" 44 "55" 55]
select a "33"
change find a "44" ["88" 88]
== make hash! ["33" 33 "88" 88 "55" 55] | |
but janko... if you test it, you will see that hash tables are far faster at retrieving data... the larger the set, the bigger the difference. on millions of records indexed with strings, it could be hundreds or thousands of times faster :-) | |
Graham 13-Dec-2009 [15236] | Von, I think I just mean that your password for esmtp will have to be in the script ( if it is needed .. ) |
Janko 13-Dec-2009 [15237] | Maxim: yes, I am aware that retrieving data from hashtables is really fast... I wasn't aware it would be just as fast even with 1M records, so I was quite amazed when I tried it before |
Pavel 14-Dec-2009 [15238x2] | Transferring the memory-based hash! (map! in R3) datatype into a disk-based schema, automatically keeping the hash table computation and lookup hidden from the user, gives you a RIF. The holy grail of all rebollers :) promised a long, long time ago, still waiting to be done. Anyway, hash tables are usually unsorted; when you need to search in them, some type of additional index is usually used (a B-tree for example). For the simple information of whether a key is in the set, bitmap vectors are used with advantage. When the set is really big (and the bitmap vector doesn't fit into memory) a compressed bitmap may be used, and bitwise operations on those vectors are usually much quicker than on uncompressed ones. This is why it should be used for the bitset! datatype anyway. A number of byte-aligned (BBC, PackBits, RLE) or word-aligned (WAH) schemes exist. They are used in very large datasets when the index also resides in a disk file. Once again, bitwise operations may be much quicker on those schemes even in memory. |
For those interested, the FastBit webpage is a good source of docs. | |
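As an illustration of the bitmap-vector idea Pavel describes (membership test plus bitwise set operations), here is a minimal C sketch. It shows only the uncompressed case: one bit per key id, with set intersection done as one AND per 64 keys. The function names and the 64-bit word size are assumptions made for this example; this is not FastBit's API, and none of the compression schemes named above (BBC, RLE, WAH) are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names for this sketch; one bit per possible key id. */
#define BITMAP_WORDS(nbits) (((nbits) + 63u) / 64u)

/* Mark key `id` as present in the set. */
void bitmap_set(uint64_t *bm, size_t nbits, size_t id)
{
    assert(bm != NULL && id < nbits);
    bm[id / 64] |= (uint64_t)1 << (id % 64);
}

/* Return 1 if key `id` is in the set, 0 otherwise (including out of range). */
int bitmap_test(const uint64_t *bm, size_t nbits, size_t id)
{
    assert(bm != NULL);
    if (id >= nbits) return 0;
    return (int)((bm[id / 64] >> (id % 64)) & 1u);
}

/* Set intersection: a single AND handles 64 keys at a time. */
void bitmap_and(uint64_t *out, const uint64_t *a, const uint64_t *b, size_t nbits)
{
    size_t i, nwords = BITMAP_WORDS(nbits);
    for (i = 0; i < nwords; i++) out[i] = a[i] & b[i];
}

/* Count the keys present by clearing the lowest set bit until none remain. */
size_t bitmap_count(const uint64_t *bm, size_t nbits)
{
    size_t i, n = 0, nwords = BITMAP_WORDS(nbits);
    for (i = 0; i < nwords; i++) {
        uint64_t w = bm[i];
        while (w) { w &= w - 1; n++; }
    }
    return n;
}
```

For 5 million keys this takes about 625 KB uncompressed; the compressed schemes mentioned above trade some decoding work for a smaller footprint when long runs of identical bits occur.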
Maxim 14-Dec-2009 [15240x2] | when map! is added to extensions, you might be able to implement an example for us, and Carl might consider adding your code directly in the host or r3lib if you agree to it. :-) |
you seem to be knowledgeable about this already, so you'd be the best one to implement it IMHO (pavel). | |
Pavel 15-Dec-2009 [15242] | I'd be glad to try, but the internals are quite well hidden now. Anyway, have you found any hint about handles or cross-referencing from an extension, Maxim? |
Maxim 15-Dec-2009 [15243] | you mean calling code from the host within extensions? |
Pavel 15-Dec-2009 [15244] | yes, that is how I understood your announcement |
Maxim 15-Dec-2009 [15245x4] | I will be rebuilding the callback example with a much better/simpler design. but they work very well, basically I have mapped the Reb_Do_String() and Reb_Print() functions so that they can be called from within any extension. |
I am also building little helper funcs like a REBOL datatype centric version of sprintf which acts a bit like a C-side rejoin for REBOL. | |
this way we can create rebol code directly from strings and native data very easily. there is currently a size limit on executed strings; it's a simple question of optimisation. this means we can't use the wiredf function for creating large datasets via strings (for now). but I'm already doing stuff like:
wiredf("rogl-event-handler make wr-event [new-size: %p]", win-w, win-h);
this calls rebol's do with %p replaced by a pair, using 2 ints. this is a varargs function. | |
(the callback framework is currently called wired) | |
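To make the description of a varargs formatter with a pair placeholder concrete, here is a small C sketch. It is not Maxim's wiredf: the name `format_rebol`, the buffer-based signature, and the limited placeholder set (`%d` for one int, `%p` for two ints written as a REBOL pair) are all assumptions made for this example. It only builds the string; handing the result to the host's do function is left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical sketch, not the real wiredf. Expands %d (one int) and
   %p (two ints, written as a REBOL pair such as 640x480) into out.
   Returns the length written, or -1 if out is too small. */
int format_rebol(char *out, size_t cap, const char *fmt, ...)
{
    va_list args;
    size_t len = 0;
    int failed = 0;

    assert(out != NULL && fmt != NULL && cap > 0);
    out[0] = '\0';
    va_start(args, fmt);
    while (*fmt != '\0' && !failed) {
        int n;
        if (fmt[0] == '%' && fmt[1] == 'p') {
            /* A pair consumes two int arguments: width, then height. */
            int w = va_arg(args, int);
            int h = va_arg(args, int);
            n = snprintf(out + len, cap - len, "%dx%d", w, h);
            fmt += 2;
        } else if (fmt[0] == '%' && fmt[1] == 'd') {
            n = snprintf(out + len, cap - len, "%d", va_arg(args, int));
            fmt += 2;
        } else {
            /* Any other character is copied through unchanged. */
            n = snprintf(out + len, cap - len, "%c", *fmt);
            fmt += 1;
        }
        if (n < 0 || (size_t)n >= cap - len) failed = 1;
        else len += (size_t)n;
    }
    va_end(args);
    return failed ? -1 : (int)len;
}
```

Called as `format_rebol(buf, sizeof buf, "rogl-event-handler make wr-event [new-size: %p]", 640, 480)`, it leaves `rogl-event-handler make wr-event [new-size: 640x480]` in `buf`, which is the kind of string the message describes passing to do.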
Pavel 16-Dec-2009 [15249] | Thanks for info Maxim. |