r3wp [groups: 83 posts: 189283]
  • Home
  • Script library
  • AltME Archive
  • Mailing list
  • Articles Index
  • Site search
 

World: r3wp

[Profiling] Rebol code optimisation and algorithm comparisons.

Terry
22-May-2010
[310]
(I prefer the term "things" over "objects" as triples don't usually 
contain methods, just properties)
Terry
26-May-2010
[311]
Any thoughts on speeding up intersect when comparing large blocks?
Ladislav
26-May-2010
[312x3]
You should not use Intersect on blocks
(if you want speed)
my suggestion is to try it on hashes
Ashley
26-May-2010
[315]
intersect is a native so the only way you can get faster performance 
is if you know something about the blocks contents. For example, 
if both blocks contain sorted integers *and* one block is smaller 
than the other *and* the bigger block contains the first value of 
the smaller then the following code *may* be faster than intersect:

	a: [2 3 4]
	b: [1 2 3 4 5]
	x: last b
	b: find b first a
	r: copy []
	foreach i a [
		all [i > x break]
		all [find b i append r i]
	]
	b: first b
Ladislav
27-May-2010
[316x2]
When using Intersect on blocks, you should take into account, that 
their contents are hashed first. Therefore, using hashes instead 
of blocks, the hashing operation itself may be spared.
but, I did not do any speed comparisons, so it is up to you, Terry
Terry
27-May-2010
[318x2]
i just tried some comparisons and the time is the same
however, I think I've finally created a rocket db
Ladislav
27-May-2010
[320x2]
the time is the same

 - then, it looks, that the Intersect native is not "clever enough", 
 doing an unnecessary operation.
nevertheless, hashing can certainly be spared, so, doing Intersect 
"by hand" can be faster, in my opinion
Terry
27-May-2010
[322x2]
i have a block with 1 million sub blocks representing objects, and 
they each have a random 'age value  between 1 - 100.. i can pull 
the indexes of all the blocks where age = 42 in 0.7 seconds.. 
that's the slowest query i have.
in that same 1 million objects, i have an object with fname = "Bob" 
.. i can pull that out in 0.0014 secs.
Steeve
27-May-2010
[324]
There should be a ratio where the iterative method is faster than 
intersect, but some tests have to be done.
ratio = (lenght? bigest-block) / (length? smallest-block)
Terry
27-May-2010
[325]
In my comparisons with Redis, I was getting 7200 GETS (pulling the 
value of a key) per second from a DB with 100,000 key/value pairs.. 
 With this rebol db, i can do 1,000,000 GETS in 0.64 seconds, from 
a db with 1,000,000 key/value pairs.
Ladislav
27-May-2010
[326]
Terry's informations are quite inexact. Intersection of hashes is 
*much* faster, than intersection of blocks, exactly as I expected.
Oldes
27-May-2010
[327]
I agree:
>> b1: make hash! [1 2 3 4] b2: make hash! [2 3 4 5 6]
>> tm 100000 [intersect b1 b2]
== 0:00:00.313
>> b1: make block! [1 2 3 4] b2: make block! [2 3 4 5 6]
>> tm 100000 [intersect b1 b2]
== 0:00:00.766
Terry
27-May-2010
[328]
My timings were for a script, where intersecting was one part. Whether 
intersecting block or hash made no noticeable difference.
Ladislav
27-May-2010
[329]
Well, since your timings did not detect, that hash intersects are 
*much* faster than block intersects, this is a proof, that your timings 
do have very little in common with the speed of intersect
Terry
27-May-2010
[330]
one odd thing i find with timing in general is why do i get random 
values each time i run a script? shouldn't it always be the same?
PeterWood
27-May-2010
[331]
I would be very suspicious if the timings where the same. All machines 
these days are running multiple processes so your script's access 
to the processor will differ each time you run it.


Rebol itslef may well be inconsistent as the garbage collector will 
run at differnet points in you tests.
Oldes
27-May-2010
[332]
What do you mean by random values? I can see some differencies, but 
not so random. The function I use for timing is:

tm: func [count [integer!] code [block!] /local t][t: now/time/precise 
loop count code probe now/time/precise - t]
Ladislav
27-May-2010
[333]
As opposed to that, I propose a more precise http://www.fm.tul.cz/~ladislav/rebol/timblk.r
Steeve
27-May-2010
[334]
Vista --> 0.000990825688073395
Ladislav
27-May-2010
[335]
thanks, Steeve, that looks like 1 millisecond, taking into account 
the given 5% precision
Ladislav
31-May-2010
[336]
Hi all, I adjusted the  http://www.fm.tul.cz/~ladislav/rebol/timblk.r
file to reflect Steeve's measurement, but would like to obtain more 
results (results for other operating systems). If you have the results 
that are not already in the file, please, let me know.
Oldes
31-May-2010
[337x2]
R2, XP ---> 1.54857142857143E-2
sorry, it's already in the file., And it's XP SP3
Paul
31-May-2010
[339]
9.8876404494382E-4 Vista
Gregg
31-May-2010
[340]
XP x64
== 1.54678899082569E-2
Ladislav
31-May-2010
[341x2]
thanks, I assume, Gregg, that your value is either for XP x64 SP2 
or SP3?
hmm, Paul's value is quite strange, taking into account, that Steeve 
measured it to be 1 ms
Paul
31-May-2010
[343]
mine is on 32 bit Vista running AMD.
Ladislav
31-May-2010
[344]
aha, OK, the difference is still just 1.12%, i.e. it is in the tolerance
Gregg
31-May-2010
[345]
Sorry, yes, SP2.
Steeve
31-May-2010
[346x2]
Btw, got the same results with R2 and R3
This test shows well why some graphical demos work bad on some systems 
even if the computers are faster than mine  (Vista with a tiny celeron).
Ladislav
31-May-2010
[348x2]
hmm, I just checked a Vista Business 64 bit (not updated for quite 
some time), and it used 15.5 milliseconds too...
The same result with R2 and R3
 - of course, this value is a property of the operating system
Maxim
31-May-2010
[350x2]
it could have been that they use different timer libs.
or whatever lib the core uses to calculate time
Ladislav
31-May-2010
[352x2]
well, this is not about "time calculation", it is about frequency 
of time update
but, certainly, it is possible to update the time every time the 
operating system is asked to provide the time. That way, the time 
can be updated as frequently, as the program is able to ask. I guess, 
that in Linux it works that way, any results?
Maxim
31-May-2010
[354x2]
Although I can only guess, I think the issue lies in that the actual 
os calls being used do not provide greater precision.


In R3 there are other means to get the time which do provide much 
greater precision, so its strange that they do not also apply to 
now/precise.
wrt "get the time"... I meant to say:

get timing information
Ladislav
31-May-2010
[356]
...its strange that they do not also apply...

 - nothing strange, it is a different kind of counter, not the one 
 used to get the system time (that might mean, that it actually is 
 less precise, e.g.)
Maxim
31-May-2010
[357x2]
What I think too...


but if R3 can get precise timing info, it could arguably add this 
precise elapsed period since last *time* check and add it to the 
result when now/precise is called in a script. 

just an idea... thinking out loud.
even in R2, when you look at an event! object, the timing counter 
within is much more precise than now/precise...  which is why I can 
use mouse move events to check time elapsed much more precisely than 
the mediocre 'time event generating, OS timer which /view is using.
Ladislav
31-May-2010
[359]
precise timing

 - that is exactly the opposite meaning than what I am using. Example:


clocks usually display time in seconds, using second as the smallest 
unit. That does not mean, clocks are imprecise, they may actually 
be more precise, than a counter counting every microsecond, e.g., 
but with a 0.1 % error, meaning, that in one day, such a counter 
may be e.g. one and a half minute slow in one day