World: r3wp


[Profiling] Rebol code optimisation and algorithm comparisons.

Maxim 18-May-2010 [110x2]	ultimate-find: func [
	    series
	    value
	    index "field you want to search on, should be (1 <= index <= record-length)"
	    record-length
	    i "iterations"
	    /local s st result
	][
	    prin "ultimate find(): "
	    st: now/precise
	    while [i > 0][
	        prin i
	        prin "."
	        result: clear [] ;length? series
	        s: at series index
	        until [
	            not all [
	                s: find/skip s value record-length
	                insert tail result copy/part skip s (-1 * index + 1) record-length
	                s: skip s 3
	            ]
	        ]
	        i: i - 1
	    ]
	    prin join " -> " difference now/precise st
	    print [" " (length? result) / record-length "matches found"]
	    head result
	]
Maxim 18-May-2010 [110x2]	searching for strings will be slower... probably much slower... just try it out with some data :-D
Terry 18-May-2010 [112]	what does it mean "field you want to search on"?
Maxim 18-May-2010 [113]	what item of your record you want to match against... basically what you meant by searching subject, predicate or value
Terry 18-May-2010 [114]	yeah, so if i say 1 then that's all subjects?
Maxim 18-May-2010 [115]	yep it will match the value against subjects only.
Terry 18-May-2010 [116x4]	no go then
	>> dataset
	== [6 27744 92191 1 61175 9905 9 62225 72852 7 31935 71556 4 59248
	>> ultimate-find dataset 1 1 zz 1
	ultimate find(): 1. -> 0:00:00.031 0 matches found
	== []
	should have picked up that fourth index (1)
	this worked..
	>> ultimate-find dataset 6 1 zz 1
	ultimate find(): 1. -> 0:00:00.219 1 matches found
Maxim 18-May-2010 [120]	zz should be 3.
Terry 18-May-2010 [121x2]	oh.. i thought zz was the length? of dataset
Terry 18-May-2010 [121x2]	ah.. that works GREAT
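The mix-up above comes down to what `find/skip` does with its size argument: it treats the series as fixed-size records and compares only the first item of each record, counting from the current position. A minimal sketch, assuming REBOL 2 semantics and reusing the first nine values of Terry's dataset (the shortened `dataset` is for illustration only):

```rebol
REBOL []

dataset: [6 27744 92191 1 61175 9905 9 62225 72852]

; Record size 3: items 1, 4 and 7 are compared, so the subject 1 is found.
probe find/skip dataset 1 3

; Record size = length of the whole series: only item 1 (the 6) is compared,
; so the search for 1 comes back empty, as in Terry's first attempt.
probe find/skip dataset 1 length? dataset
```

This would also explain why searching for 6 appeared to work with `zz`: 6 happens to be the very first item of the dataset.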
Maxim 18-May-2010 [123]	>> dataset: [1 "a" "B" 4 "h" "V" 1 "z" "Z" 4 "p" "d" 4 "k" "i" 4 "y" "o"]
	== [1 "a" "B" 4 "h" "V" 1 "z" "Z" 4 "p" "d" 4 "k" "i" 4 "y" "o"]
	>> ultimate-find dataset 4 1 3 1
	ultimate find(): 1. -> 0:00 4 matches found
	== [4 "h" "V" 4 "p" "d" 4 "k" "i" 4 "y" "o"]
Terry 18-May-2010 [124]	very nice
Maxim 18-May-2010 [125x2]	ultimate find(): 1. -> 0:00 1 matches found == [1 "a" "B"] :-)
Maxim 18-May-2010 [125x2]	oops missing cmd line...
Terry 18-May-2010 [127]	so if the dataset is key/value just use 2 as the record-length
Maxim 18-May-2010 [128x2]	>> ultimate-find dataset "a" 2 3 1
	ultimate find(): 1. -> 0:00 1 matches found
	== [1 "a" "B"]
Maxim 18-May-2010 [128x2]	yep
Terry 18-May-2010 [130x3]	cool
	i inserted "maxim" "age" "unknown" and appended "terry" "age" "42" into the dataset containing 6 million records..
	>> ultimate-find dataset "age" 2 3 1
	ultimate find(): 1. -> 0:00:00.093 2 matches found
	== ["maximn" "age" "unknown" "terry" "age" "42"]
	I'll say that's a respectable time... and the leading contestant :)
Maxim 18-May-2010 [133]	:-)
Terry 18-May-2010 [134x4]	now if only i was 42 again...
	But wait, there's more.... convert dataset to hash! and run ultimate-find again!
	>> ultimate-find dataset "age" 2 3 100
	ultimate find(): -> 0:00 2 matches found
	== ["maximn" "age" "unknown" "terry" "age" "42"]
	100 iterations not even registering
	1000 iterations 0.40
Maxim 18-May-2010 [138]	OMG !
Terry 18-May-2010 [139]	exactly
Maxim 18-May-2010 [140x2]	but I'm getting an odd deadlock here on some tests... hum...
Maxim 18-May-2010 [140x2]	I'm getting extremely slow results on dense tests...
Terry 18-May-2010 [142x2]	interesting... im not too worried as density isn't a big issue with triple stores
Terry 18-May-2010 [142x2]	im off.. good luck with your optimizations
Maxim 18-May-2010 [144]	I'm talking like 100 times worse! the larger the list the worse it gets... seems like an exponential issue.
Terry 18-May-2010 [145]	that seems like an anomaly
Maxim 18-May-2010 [146]	both dense tests perform pretty much the same, the moment I convert it to a hash, it gets reallllly slow.
Terry 18-May-2010 [147x2]	yeah, i see that too
Terry 18-May-2010 [147x2]	mind you, that's pretty dense data
Maxim 18-May-2010 [149]	the strange thing is I did tests using a record size of 2, which wouldn't trigger strange misaligned key/value issues. I even removed the copy to make sure that wasn't the issue, and one test with only 400000 records took more than 4 minutes to complete vs .297 for the feach test!
Terry 18-May-2010 [150x2]	I'm looking for the 6 integer.. it's still cranking and i can hear my system struggling..
Terry 18-May-2010 [150x2]	must be a loop error
Maxim 18-May-2010 [152]	well, the results were the same at the end... pretty weird... maybe someone has encountered this before and can explain why this happens....
Pekr 18-May-2010 [153]	Max - just a question - wouldn't using parse be faster than find/skip?
Ladislav 18-May-2010 [154]	my advice would be: 1) to test Parse as Pekr noted (traversing only the respective field) 2) to use a hash to index the respective field
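Ladislav's second suggestion, indexing one field with a hash, could look roughly like the sketch below. It assumes REBOL 2 (where `hash!` exists) and 3-item records; `build-index` and `subjects` are hypothetical names, not code from this discussion, and whether this avoids the slowdown Maxim saw on dense data is untested here.

```rebol
REBOL []

dataset: [1 "a" "B" 4 "h" "V" 1 "z" "Z" 4 "p" "d"]

; build-index is a hypothetical helper, not code from this discussion.
; It returns a hash! of alternating entries: field value, block of record positions.
build-index: func [data field size /local idx pos key slot][
    idx: make hash! []
    pos: data
    while [not tail? pos][
        key: pick pos field
        either slot: select idx key [
            append slot index? pos               ; value seen before: add this record
        ][
            append idx key                       ; first occurrence: new key ...
            append/only idx reduce [index? pos]  ; ... with its own position block
        ]
        pos: skip pos size
    ]
    idx
]

subjects: build-index dataset 1 3

; Look up every record whose subject is 4 without scanning the dataset again.
foreach pos any [select subjects 4 []] [
    probe copy/part at dataset pos 3
]
```

The index is built once per field; each lookup is then a single `select` on the hash followed by direct positioning with `at`, instead of a fresh pass over the whole dataset.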
Maxim 18-May-2010 [155]	I didn't do any parse test tweaks... but find/skip is very fast so far, we can skip over 100 million records within a millisecond. not sure parse can beat that
Terry 18-May-2010 [156]	Did you find a solution to the density issue Max?
Maxim 18-May-2010 [157]	nope... I'm working on urgent stuff won't have time for a few days to put more time on this.
Steeve 18-May-2010 [158]	haven't tested in a while in R2, but in R3 parse is faster in most cases (if you write the rules correctly)
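As a starting point for the Parse comparison suggested above, here is a minimal sketch of a block-parse rule that walks the dataset one record at a time and tests only the first field. It assumes 3-item records and a match on field 1; `matches`, `rec` and `v` are illustrative names, and no timing claim is made for either R2 or R3.

```rebol
REBOL []

dataset: [1 "a" "B" 4 "h" "V" 1 "z" "Z" 4 "p" "d"]
matches: copy []

parse dataset [
    any [
        rec:            ; remember where this record starts
        set v skip      ; read field 1 into v
        2 skip          ; step over the other two fields
        (if v = 4 [append matches copy/part rec 3])
    ]
]

probe matches
```

To compare it fairly with `ultimate-find`, the rule would need to be timed on the same dataset and the same number of iterations.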
Terry 18-May-2010 [159]	I'm wondering if it has something to do with recreating the hash each time a value is found?