AltME groups: search


results summary

world   hits
r4wp    5907
r3wp    58701
total:  64608

results window for this page: [start: 46201 end: 46300]

world-name: r3wp

Group: Core ... Discuss core issues [web-public]
Ladislav:
9-May-2010
re "ask in chat or CC" - I asked in chat, privately, but do not see 
any reaction yet, and I added a comment to CC #1571
Maxim:
10-May-2010
tuples aren't immutable in R2 or R3

>> a: 1.2.34
== 1.2.34
>> a/2: 33
== 33
>> a
== 1.33.34

this works in both R2 and R3
Steeve:
10-May-2010
nope, they are

>> a: b: 1.2.3
== 1.2.3
>> a/1: 33
== 33
>> a
== 33.2.3
>> b
== 1.2.3
amacleod:
11-May-2010
BrianH asked: Didn't know you could put the /compare index in a block. 
Can you specify more than one index?

Sunanda says: Yes you can:
sort/skip/compare s 4 [1 2 4 3]
== [1 2 8 a 1 2 6 b 1 2 7 c]


That's great but can you do that with a block of blocks of  data....

example: 
s: [
	[1 2 8 a]
	[1 2 6 b]
	[1 2 7 c]
]
Maxim:
11-May-2010
use a compare function for that
Sunanda:
11-May-2010
To turn Maxim's comment into an example:

   sort/compare s func [a b] [
== [
    [1 2 6 b]
    [1 2 7 c]
    [1 2 8 a]
]
s
Sunanda:
11-May-2010
Oops that SORT line should be
   sort/compare s func [a b] [return a/3 < b/3]
Maxim:
11-May-2010
thanks... I was a bit lazy there  ;-)
amacleod:
11-May-2010
so something like this will give me a sort on multi indexes?
sort/compare s func [a b] [return a/4 < b/4 return a/2 < b/2]
Maxim:
11-May-2010
btw return is never required at the end of a func .. and slows down 
rebol A LOT in tight loops.
Maxim:
11-May-2010
try this instead:

sort/compare s func [a b] [
	either  a/4 = b/4  [
		a/2 < b/2
	][
		a/4 < b/4
	]
]
amacleod:
11-May-2010
I do not know if that is what I want...i'm looking to prioritize 
each compare giving each a "weight"
amacleod:
11-May-2010
example:

sorting a list of first and last names, first on last name, then on first 
name when the last names are the same
Sunanda:
11-May-2010
The basic compare function then (assuming you are sorting a block 
of objects) is:

   func [a b] [if a/surname = b/surname [return a/firstname < b/firstname] 
   return a/surname < b/surname]
Sunanda:
11-May-2010
Basically, yes.....You may find a neat way of refactoring....Or JOINing 
keyfields so you can do a single compare....but that is the basic 
principle.
Maxim:
11-May-2010
late but here is an equivalent working example:

names: [
	["zz" "bb"] ["cc" "zz"] ["aa" "aa"]["bb" "ee"]
	["zz" "tt"] ["cc" "aa"] ["aa" "yy"]["bb" "aa"]
]
sort/compare names func [a b][
	either a/1 = b/1 [a/2 < b/2][a/1 < b/1]
]
probe names

== [ ["aa" "aa"] ["aa" "yy"] ["bb" "aa"] ["bb" "ee"] ["cc" "aa"] 
["cc" "zz"]  ["zz" "bb"] ["zz" "tt"] ]
Maxim:
11-May-2010
yep... as sunanda said, you need to chain the compares.


there might be a way of using ANY/ALL functions too, but it can be 
tricky ... though usually yields less code.
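A minimal sketch of the ANY/ALL variant mentioned above, reusing the names block from Maxim's earlier example. The function name by-last-then-first is made up for illustration, and the TO LOGIC! wrapper is an assumption added so the comparator always hands SORT a plain true or false rather than none:

```rebol
REBOL [Title: "Two-key sort with ANY/ALL (sketch)"]

names: [
    ["zz" "bb"] ["cc" "zz"] ["aa" "aa"] ["bb" "ee"]
    ["zz" "tt"] ["cc" "aa"] ["aa" "yy"] ["bb" "aa"]
]

; by-last-then-first is a hypothetical name, not from the chat.
by-last-then-first: func [a b] [
    to logic! any [
        a/1 < b/1                       ; primary key decides
        all [a/1 = b/1  a/2 < b/2]      ; tie: fall back to the secondary key
    ]
]

sort/compare names :by-last-then-first
probe names
```

It defines the same ordering as the EITHER version, so it should leave names in the same order; whether it is really less code or easier to read is a matter of taste.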
Sunanda:
11-May-2010
Something like this will sort on all fields
 

sort/compare s func [a b] [for n 1 length? a 1 [if a/:n < b/:n [return 
-1] if a/:n > b/:n [return +1]] return 0]
Maxim:
11-May-2010
refining sunanda's example, sorting on selected fields

sort/compare s func [a b] [
	repeat n [2 4] [
		if a/:n < b/:n [return -1] if a/:n > b/:n [return +1]
	] 
	return 0
]


I just indented it to make it a bit easier to break up
BrianH:
13-May-2010
That might be tricky though, because different FTP servers format 
that information differently. A large portion of the source for FileZilla 
or other FTP clients that have GUIs for file listings is different 
parsers for the file listings of different FTP server software.
Graham:
13-May-2010
the r2 formatting code is a page of parse rules
BrianH:
13-May-2010
Practically speaking, there are probably over a hundred different 
FTP server platforms, and the R2 parsing code only supports (hopefully) 
most of them.
Graham:
13-May-2010
So, if you're not looking for a generalized solution.. then it's 
quite doable.
amacleod:
13-May-2010
network-modes? I do not see a reference to file dates.
BrianH:
13-May-2010
INFO? is a different function than GET-MODES.
Graham:
15-May-2010
You probably need a custom header
Henrik:
16-May-2010
but if you say that, is there a newer one? this one hangs in OSX.
Terry:
16-May-2010
Q: Can i use select/any against pairs! in a hash! ? Example?
Terry:
16-May-2010
I'm looking for a delimiter in hash keys to separate two integers 
that I can perform a select/any on
Graham:
17-May-2010
interesting to benchmark but I suggested a math solution as normally 
math is faster than anything else
Terry:
17-May-2010
ok, one last one for my rusty rebol

FINDing the index of all occurrences of an integer in a series, i.e.:
blk: [239 4545 23 655 23 656]
search for 23 and return [3 5]

(not using foreach)
Terry:
17-May-2010
FIND, SELECT and PICK are blazing.. foreach is a game killer. Need 
to work out a way to FIND all values
Sunanda:
17-May-2010
You could tweak something like this:
    res: make block! length? blk
    while [ind: find blk 23]
         [print blk append res index? ind blk: next ind]
    blk: head blk
    probe res


But remember the first rule of REBOL Code Golf: parse always wins.....We're 
now just waiting for a parse guru to show us how :)
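The FIND-in-a-loop idea can be wrapped in a function. This is only a sketch: indices-of is a hypothetical helper name, the initial result size of 8 is an arbitrary assumption, and it makes no claim about speed compared to the parse or foreach versions discussed in this group.

```rebol
REBOL [Title: "Collect every index of a value with FIND (sketch)"]

; indices-of is a hypothetical helper name, not something from the chat.
indices-of: func [series [series!] value /local result pos] [
    result: make block! 8
    pos: series
    while [pos: find pos value] [
        append result index? pos    ; index is counted from the head
        pos: next pos               ; resume just past this match
    ]
    result
]

probe indices-of [239 4545 23 655 23 656] 23    ; prints [3 5]
```

Stepping with NEXT from each match, rather than skipping by the absolute index, keeps adjacent matches from being jumped over.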
Graham:
17-May-2010
since your comparison is occurring in a mezzanine
Graham:
17-May-2010
It's because you can't match an integer like a word
Terry:
17-May-2010
Against a block with 100,000 integers

Lad's : 0.047
foreach: 0.044
Pekr:
17-May-2010
A slightly adapted version of Lad's (does not use subtraction):

indices?-4: func [
	series [series!]
	value
	/local result
] [
	result: make series 0
	parse series [
		any [series: 1 1 value (append result index? series) | skip]
	]
	result
]


>> time-block [indices?-2 blk 23] 0,05
== 0.000003936767578125

>> time-block [indices?-4 blk 23] 0,05
== 0.000003753662109375
Graham:
17-May-2010
Remember that we are paying R2 customers .. and R3 is not going to 
be usable for a few more years
Pekr:
17-May-2010
e.g. for me, RebGUI is a dead end. I talked to Bobik, and he is back 
to VID for simple stuff. There were many changes lately, and some 
things got broken, and it does not seem to be supported anymore. 
As for GUI, I believe that in 2-3 months, you will be able to talk 
otherwise, as Robert wants to move his tools to R3 definitely ...
Pekr:
17-May-2010
I don't care about the stickers. Someone said it is alpha, so tell 
to your brain, that it is a beta, and you are OK, easy as that :-) 
I use products according to functionality, not its official alpha/beta/theta 
status ...
Terry:
17-May-2010
>> a

== [23 43 55 28 345 99 320 48 22 23 95 884 1000000 999999 999998 
999997 999996 999995 999994 999993 999992 999991 999990 999989 999...
>>

(a is a block with 1,000,012 integers) 


ind: func [series value x] [
	st: now/time/precise
	while [x > 0] [
		result: make series 0
		series: head series
		parse series [
			any [series: 1 1 value (append result index? series) | skip]
		]
		x: x - 1
	]
	et: now/time/precise
	fin: et - st
	print fin
	result
]

feach: func [series value x] [
	result: make series 0
	st: now/time/precise
	while [x > 0] [
		foreach [s p v] series [
			if s = value [append result reduce [s p v]]
		]
		x: x - 1
	]
	et: now/time/precise
	fin: et - st
	print fin
	result
]

>> ind a 23 10
0:00:01.249
== [1 10 999990]
>>

>> feach a 23 10
0:00:01.01

== [23 43 55 23 95 884 23 43 55 23 95 884 23 43 55 23 95 884 23 43 
55 23 95 884 23 43 55 23 95 884 23 43 55 23 95 884 23 43 55 23 9...
>>

10 iterations each.. 


foreach is the winner speed wise.. as a bonus, If i use foreach, 
I don't need the index?
Terry:
17-May-2010
Still this is too slow.. it's fine for 1M data, but 10M and it grinds 
hard.

.. there must be a faster way to find integers in a block (or hash) 
using SELECT, FIND or INDEX?
Maxim:
17-May-2010
yes... you do a find, within a while block.   that was the fastest 
loop I did for matching exact SAME? strings.  I'd bet its the best 
here too.
Maxim:
17-May-2010
also, until is a bit faster than while IIRC.
Maxim:
17-May-2010
in your tests, this causes your loops to slow down a lot:

result: make series 0

you should use:

result: make series length? series


because when appending, you will be re-allocating the result over 
and over... and the GC is the loop killer in every single big dataset 
tests I've done.  not because its bad, per se, but because you are 
forcing it to operate.
Maxim:
17-May-2010
I'm building a little example which I think will outperform your 
examples... I'm curious.
Terry:
17-May-2010
moving the result: make series 0 out of the while loop had a 10 milli 
improvement over 10 iterations.
Maxim:
17-May-2010
that might account for the rarity of results.  if your data has a 
lot of occurrences, then that number will increase, since your result 
block will grow much more.
Gregg:
17-May-2010
You may need to move beyond brute force Terry, and set up a data 
structure to optimize the searching.
Terry:
17-May-2010
There must be a way.

An index is a symbol that represents some value. What I need is to 
add some metadata to that index, and make that searchable.
Terry:
17-May-2010
the goal is a blazing key/value store that's as fast as pulling by 
index
:)
Terry:
17-May-2010
The other thing I considered was using pair! values as a pair of symbols, 
but I can't search those either without foreach
ie: [ 23x54 "value 1" 984x2093 "value 2"]
Maxim:
17-May-2010
10 x 10 million items, with a single value within the block...  

0:00:00.234  using a 1.5GHz laptop.
Maxim:
17-May-2010
plus record-size is variable, and is supplied as a parameter to the 
function.
Terry:
17-May-2010
On your doc, regarding foreach you mentioned.. 

blocks get faster as you grab more at a time, while strings slow 
down, go figure!
 
but the stats look the opposite?
Maxim:
17-May-2010
if you look at the speeds,  cycling through two items at a time should 
give roughly twice the ops/second.


when you look at the results... blocks end up being faster than this, 
and strings are less than this.
Maxim:
17-May-2010
btw... I discovered  //  instead   of   mod,   which is twice as 
fast..... turns out  MOD is a mezz... never noticed that.
Andreas:
17-May-2010
here's a tiny bit of setup code, adjust dim1/dim2 as you wish. then 
find the needles in the haystack: indices? haystack needle
Maxim:
17-May-2010
I'm working on a (bigger) script which has similar setup code... 
specifically meant to compare different dataset scenarios
Andreas:
17-May-2010
interestingly enough, a naive c extension is only negligibly faster
Andreas:
17-May-2010
null just creates a result block, to demonstrate that 50% of runtime 
is mem allocation for the 10m result array (so that's where one should 
really spend time optimising).


p1 is Ladislav's `1 1 value` parse, p2 is Pekr's `quote (value)` parse, 
u1/2/3 are the until-based versions shown above, ext is a naive C 
extension.
Maxim:
17-May-2010
I've just finished my tests... I've got a keyed search func which 
returns the exact same results as feach but  20 times faster!

I'll put the whole script in the profiling group... it  has several 
dataset creations for comparison and includes a clean run-time printout.
Paul:
17-May-2010
you mean searching a block such as [1 "this" 2 "that" 3 "more"] 
 etc..?
Terry:
18-May-2010
ideally, a large block of  key/values like ["key1" "value 1" "key 
2" "value 2"] with the ability to use pattern matching on keys or 
values... but FAST
Ladislav:
18-May-2010
Terry: "foreach is the winner speed wise.. as a bonus, If i use foreach, 
I don't need the index?" - unbelievable, how you compare apples and 
oranges without noticing
Ladislav:
18-May-2010
Terry: "I don't care" - you should, since you are comparing speed 
of code adhering to different specifications. If you really want 
to find the fastest code for a given specification, that is not the 
way to take.
Maxim:
18-May-2010
so it will grow to match the needs of the dataset, optimising itself 
in size within a few searches and then not re-allocating itself too 
often.
Ladislav:
18-May-2010
there are many important differences:

*make series 0 does not necessarily make a block

*regarding the allocation length - if you can estimate reliably the 
necessary length, then you are better off
Maxim:
18-May-2010
with 10 million records, I estimated my dense datasets at about 120000 
records and did a few tests, they were MUCH slower than using 
result: clear [ ]
Maxim:
18-May-2010
thing is many times, if not most of the times, you don't need to 
copy the result as a new block, and that also saves A LOT of ram 
in my tests...


overall, about 500MB of RAM were saved by not pre-allocating large 
buffers
Maxim:
18-May-2010
and in any case, you can copy the result of the search, at which 
point you have a perfect size and as little wasted RAM as possible.
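A rough sketch of the reusable-buffer idea described above, assuming a simple FIND loop as the search. find-all is a hypothetical name for illustration; it is not Maxim's find-fast or ultimate-find, which live in the profiling group and are not shown here.

```rebol
REBOL [Title: "Reusing one result buffer with CLEAR [] (sketch)"]

; find-all is a hypothetical name; it is not Maxim's find-fast or ultimate-find.
find-all: func [series [series!] value /local result pos] [
    result: clear []    ; the same literal block is reused on every call
    pos: series
    while [pos: find pos value] [
        append result index? pos
        pos: next pos
    ]
    result              ; caller must COPY this to keep it across calls
]

first-hits: copy find-all [1 2 1 3 1] 1     ; private, exactly-sized copy
second-hits: find-all [7 8 7] 8             ; shares the internal buffer
probe first-hits                            ; prints [1 3 5]
probe second-hits                           ; prints [2]
```

The buffer grows to whatever the largest search needed and is then reused, so the trade-off is that any result not copied is overwritten by the next call.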
Pekr:
18-May-2010
Max - where do I get the dataset from, if I would try to rewrite 
your find-fast into a version using 'parse? :-) Do you generate one?
Maxim:
18-May-2010
look in profiling, there is a full script with verbose printing and 
everything you need, just replace the loop in one of the funcs  :-)
Maxim:
18-May-2010
you can easily compare your results with the current best ... I'll 
be happy if you can beat the ultimate-find and give the exact same 
feature...

searching on any field of a record and return the whole record.
Maxim:
18-May-2010
the process manager reports  a few different values, current, swapped, 
peak, and some more obscure ones.
Maxim:
18-May-2010
though I did a special XP install which forces the OS NEVER to 
swap... and XP tends to be MUCH smoother because of it.
Henrik:
18-May-2010
I recently watched a talk by Poul Henning Kamp, author of Varnish, 
who talked about how many people misunderstand how memory allocation 
works in modern OS'es. Since he's a FreeBSD kernel developer, he 
has some bias, but he made some interesting points in that memory 
allocation is nearly free in various unixes, but most people ignore 
that and only allocate, perhaps, just enough or below what they need.
Maxim:
18-May-2010
(based on rendering 3D animations which required 4GB of swap file 
, just to load a scene  ;-)
Maxim:
18-May-2010
yes... but as long as only one application is running the CPU, you 
can have A LOT of apps in virtual RAM without real system slow down 
(on unix).
Henrik:
18-May-2010
I guess I'm wrong with Windows. allocating a 100 MB string takes 
time.
Terry:
20-May-2010
Q. how to use a word as a string value in path?
ie: ["a" 1 "b" 2]
n: "b"
ie/(n)
>> 2
Claude:
20-May-2010
very strange    a: true      logic? a  => true
Claude:
20-May-2010
but a:[true]  logic? a/1 => false
Sunanda:
20-May-2010
That's because the true inside [true] is a word, not a logic! value. Try this:
     a: reduce [true]  logic? a/1  >> true
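To spell out the difference, a short sketch using only standard type-test functions. The variable names a and b are just for illustration:

```rebol
REBOL [Title: "Word vs. logic! value inside a block (sketch)"]

a: [true]               ; the block is not evaluated: it holds the word TRUE
print word? a/1         ; true
print logic? a/1        ; false

b: reduce [true]        ; REDUCE evaluates the word to a logic! value
print logic? b/1        ; true
print logic? get a/1    ; true: GET looks the word up explicitly
```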
Terry:
24-May-2010
What's the advantage of using words in blocks? 
ie: [a "one" b "two"] vs ie2: ["a" "one" "b" "two"]
Terry:
24-May-2010
ie2/("a")
>> "one"
Pekr:
24-May-2010
yes, and the code readability maybe - ie2/("a") vs ie2/a
Henrik:
24-May-2010
using words as table keys is probably not a good idea.
Geomol:
24-May-2010
Is it a benefit, that SWITCH is case insensitive?

>> s: "aA"
== "aA"
>> switch s/1 [#"A" [print "A"] #"a" [print "a"]]
A
Geomol:
24-May-2010
Steeve, no, doesn't work with strings:

>> switch s/1 ["A" [print "A"] "a" [print "a"]]
== none

s/1 is a char! And SWITCH won't find it with a string.
Geomol:
24-May-2010
Gregg, "consistent"? Ahh... ;)


I was thinking about changing the string into a binary. What would 
you think FIRST used on a binary returns? I would expect a binary, 
but it's actually an integer. I sometimes have a hard time seeing 
the consistency in REBOL. But I know, it's hard to find the logic 
way in all this. So many datatypes. :)
Geomol:
24-May-2010
Also changing s/1 to a string gives unwanted result:

>> s
== "aA"
>> switch to string! s/1 ["A" [print "A"] "a" [print "a"]]
A


So I end up with getting the integer value of a char to make this 
work:

>> switch to integer! s/1 [65 [print "A"] 97 [print "a"]]
a

Not so good, as I see it.
Steeve:
24-May-2010
Ah! it's probably impossible to convert a char! into a string!, so 
forget it...
Geomol:
24-May-2010
Yes, I need to test on chars in a string, and #"a" should be different 
from #"A".
Steeve:
24-May-2010
well, just use parse instead, it's mostly the same structure as 
a switch.

parse/case s [
	#"A" (do something)
           | #"a" (...)
           | ...
]
Andreas:
24-May-2010
>> #"a" = #"A"
== true
Andreas:
24-May-2010
(R2 is more inconsistent in this regard, as #"a" = #"A" is false, 
but SWITCH behaves case-insensitively, as you described.)
PeterWood:
25-May-2010
>> #"a" == #"A" 
 
== false

Perhaps SWITCH needs a /Strictly refinement?
Gregg:
25-May-2010
SWITCH used to be a mezzanine, so you could easily patch it if you 
want. This is the old mezz source.

switch: func [
    [throw]
    value
    cases [block!]
    /default case
][
    either value: select cases value [do value] [
        either default [do case] [none]]
]
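One possible way to get a case-sensitive variant is to replace the SELECT lookup with an explicit loop that uses strict equality, as PeterWood's == test suggests. This is a sketch under assumptions: strict-switch is a hypothetical helper, not part of REBOL, and it only handles a flat list of value/block pairs.

```rebol
REBOL [Title: "Case-sensitive SWITCH via strict equality (sketch)"]

; strict-switch is a hypothetical helper, not part of REBOL.
; It assumes CASES is a flat list of value/block pairs.
strict-switch: func [value cases [block!] /default case] [
    foreach [key body] cases [
        if strict-equal? value key [return do body]
    ]
    either default [do case] [none]
]

s: "aA"
print strict-switch s/1 [#"A" ["upper"] #"a" ["lower"]]     ; lower
print strict-switch s/2 [#"A" ["upper"] #"a" ["lower"]]     ; upper
```

Being a mezzanine loop, it will likely be slower than the native SWITCH, so it only makes sense where the case distinction matters.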
Geomol:
25-May-2010
The char! datatype has many similarities to numbers. The following 
is from R3 and looks strange to me:

>> var - 32 = var
== true


What variable is equal to its original value, even if you subtract 
32 from it?

>> var: #"a"
== #"a"
>> var - 32 = var
== true


A bit strange. I would prefer SWITCH to distinguish between #"a" 
and #"A".