World: r3wp
[Profiling] Rebol code optimisation and algorithm comparisons.
older newer | first last |
Maxim 18-May-2010 [110x2] | ultimate-find: func [ series value index "field you want to search on, should be (1 <= index <= record-length)" record-length i "iterations" /local s st result ][ prin "ultimate find(): " st: now/precise while [i > 0][ prin i prin "." result: clear [] ;length? series s: at series index until [ not all [ s: find/skip s value record-length insert tail result copy/part skip s (-1 * index + 1) record-length s: skip s 3 ] ] i: i - 1 ] prin join " -> " difference now/precise st print [" " (length? result) / record-length "matches found"] head result ] |
searching for strings will be slower... probably much slower... just try it out with some data :-D | |
Terry 18-May-2010 [112] | what does it mean "field you want to search on"? |
Maxim 18-May-2010 [113] | what item of your record you want to match against... basically what you meant by searching subject, predicate or value |
Terry 18-May-2010 [114] | yeah, so if i say 1 then that's all subjects? |
Maxim 18-May-2010 [115] | yep it will match the value against subjects only. |
Terry 18-May-2010 [116x4] | no go then |
>> dataset == [6 27744 92191 1 61175 9905 9 62225 72852 7 31935 71556 4 59248 >> ultimate-find dataset 1 1 zz 1 ultimate find(): 1. -> 0:00:00.031 0 matches found == [] | |
should have picked up that fourth index (1) | |
this worked.. >> ultimate-find dataset 6 1 zz 1 ultimate find(): 1. -> 0:00:00.219 1 matches found | |
Maxim 18-May-2010 [120] | zz should be 3. |
Terry 18-May-2010 [121x2] | oh.. i thought zz was the length? of dataset |
ah.. that works GREAT | |
Maxim 18-May-2010 [123] | >> dataset: [1 "a" "B" 4 "h" "V" 1 "z" "Z" 4 "p" "d" 4 "k" "i" 4 "y" "o"] == [1 "a" "B" 4 "h" "V" 1 "z" "Z" 4 "p" "d" 4 "k" "i" 4 "y" "o"] >> ultimate-find dataset 4 1 3 1 ultimate find(): 1. -> 0:00 4 matches found == [4 "h" "V" 4 "p" "d" 4 "k" "i" 4 "y" "o"] |
Terry 18-May-2010 [124] | very nice |
Maxim 18-May-2010 [125x2] | ultimate find(): 1. -> 0:00 1 matches found == [1 "a" "B"] :-) |
oops missing cmd line... | |
Terry 18-May-2010 [127] | so if the dataset is key/value just use 2 as the record-length |
Maxim 18-May-2010 [128x2] | >> ultimate-find dataset "a" 2 3 1 ultimate find(): 1. -> 0:00 1 matches found == [1 "a" "B"] |
yep | |
Terry 18-May-2010 [130x3] | cool |
i inserted "maxim" "age" "unknown" and appended "terry" "age" "42" into the dataset containing 6 million records.. >> ultimate-find dataset "age" 2 3 1 ultimate find(): 1. -> 0:00:00.093 2 matches found == ["maximn" "age" "unknown" "terry" "age" "42"] | |
I'll say that's a respectable time... and the leading contestant :) | |
Maxim 18-May-2010 [133] | :-) |
Terry 18-May-2010 [134x4] | now if only i was 42 again... |
But wait, there's more.... convert dataset to hash! and run ultimate-find again! | |
>> ultimate-find dataset "age" 2 3 100 ultimate find(): -> 0:00 2 matches found == ["maximn" "age" "unknown" "terry" "age" "42"] 100 iterations not even registering | |
1000 iterations 0.40 | |
Maxim 18-May-2010 [138] | OMG ! |
Terry 18-May-2010 [139] | exactly |
Maxim 18-May-2010 [140x2] | but I'm getting an odd deadlock here on some tests... hum... |
I'm getting extremely slow results on dense tests... | |
Terry 18-May-2010 [142x2] | interesting... im not too worried as density isn't a big issue with triple stores |
im off.. good luck with your optimizations | |
Maxim 18-May-2010 [144] | I'm talking like 100 times worse! the larger the list the worse it gets... seems like an exponential issue. |
Terry 18-May-2010 [145] | that seems like an anomaly |
Maxim 18-May-2010 [146] | both dense tests perform pretty much the same, the moment I convert it to a hash, it gets reallllly slow. |
Terry 18-May-2010 [147x2] | yeah, i see that too |
mind you, that's pretty dense data | |
Maxim 18-May-2010 [149] | the strange thing is i did tests using a record size of 2, which wouldn't trigger strange mis aligned key/value issues. I even removed the copy to make sure that wasn't the issue and one test with only 400000 records took more than 4 minutes to complete vs .297 for the feach test! |
Terry 18-May-2010 [150x2] | I'm looking for the 6 integer.. it's still cranking and i can hear my system struggling.. |
must be a loop error | |
Maxim 18-May-2010 [152] | well, the results where the same at the end... pretty weird... maybe someone has encountered this before and can explain why this happens.... |
Pekr 18-May-2010 [153] | Max - just a question - wouldn't using parse be faster than find/skip? |
Ladislav 18-May-2010 [154] | my advice would be: 1) to test Parse as Pekr noted (traversing only the respective field) 2) to use a hash to index the respective field |
Maxim 18-May-2010 [155] | I didn't do any parse test tweaks... but find/skip is very fast so far, we can skip over 100 million records within a millisecond. not sure parse can beat that |
Terry 18-May-2010 [156] | Did you find a solution to the density issue Max? |
Maxim 18-May-2010 [157] | nope... I'm working on urgent stuff won't have time for a few days to put more time on this. |
Steeve 18-May-2010 [158] | didn't tested since a while in R2, but in R3, parse is faster in most of the cases (if you write correctly the rules) |
Terry 18-May-2010 [159] | I'm wondering if it has something to do with recreating the hash each time a value is found? |
older newer | first last |