Mailing List Archive: Re: Question and comment about randomizing a block of values

[REBOL] Re: Question and comment about randomizing a block of values

From: joel:neely:fedex at: 18-Jun-2001 17:18


Hi, Jos,

Minor tweaks and a major suggestion...

Jos Yule wrote:
> My question centers on 'shuffling' the deck. I know there are _many_
> methods of shuffling, with various degrees of randomness. I'd like to
> know what you all think about the following 'shuffle' function (this
> function is in a class. the list-of-cards var is a class variable):
>
> shuffle-deck: func [ "Randomizes the list-of-cards"
>         number-of-passes [number!]
> ] [
>         for a-pass 1 number-of-passes 1 [
>                 sort/compare list-of-cards func [a b /local c] [
>                         c: random 2
>                         if c = 1 [return true]
>                         if c = 2 [return false]
>                 ]
>         ]
> ]
>
> Using the sort/compare works really fast, but i'm not sure that this
> is the way its meant to be used ;), but it seems to work quite well,
> especially with multiple passes. If anyone has any suggestions,
> please let me know!
>

Your existing design above can be simplified and sped up by the
following changes:

1)  Since you're not using A-PASS within each pass, you can
    dispense with it (and the overhead of a FOR loop) entirely:

        shuffle-deck: func [ "Randomizes the list-of-cards"
            number-of-passes [number!]
        ] [
            loop number-of-passes 1 [
                ;...
            ]
        ]

2)  We can evolve the comparison function to a much simpler state:

2.1)  Since RANDOM 2 must always return 1 or 2, we don't need two
      IF tests.  This first step gives us:

          func [a b /local c] [
              either 1 = c: random 2 [return true] [return false]
          ]

2.2)  Now C is unnecessary, so eliminate it:

          func [a b] [
              either 1 = random 2 [return true] [return false]
          ]

2.3)  Now we have a tautology; we're returning the same LOGIC! value
      that EITHER is testing, so there's no point in restating it:

          func [a b] [
              return [1 = random 2]
          ]

2.4)  A function returns the last expression evaluated, so we don't
      need to say RETURN anymore:

          func [a b] [1 = random 2]

I've incorporated both of these into the generic block-shuffling
function which follows (after a little block constructor):

    iota: func [n [integer!] /local c r] [
        r: make block! n
        c: 0
        loop n [append r c: c + 1]
    ]

    >> iota 10
    == [1 2 3 4 5 6 7 8 9 10]
    >> iota 2
    == [1 2]

    shuffle-block: func [b [block!] n [integer!]] [
        loop n [sort/compare b func [a b] [1 = random 2]]
    ]

    >> shuffle-block iota 20 10
    == [15 4 13 19 9 16 6 10 12 18 5 20 3 17 14 7 2 1 8 11]

Now for a suggested change to your design...

I can think of two problems with using RANDOM inside the comparison
function:

1)  It adds cycles in the inner-most loop.  The cost of calling a
    user-written comparison function, plus the cost of whatever that
function does, can really add up for large blocks.  (I'm assuming
that the time complexity of SORT is  O(n log n) , which is canonical.)

2)  Since successive calls on RANDOM make the comparison unstable
    that can inflate the time complexity of the sort (depending on
what sort algorithm is used internally).

We can solve both of those by making this observation.  You can
shuffle a block by copying elements from the block using a set of
index values that have been shuffled (as long as each index value
occurs exactly once).  Consider this little modification on IOTA
from above:

    random-iota: func [n [integer!] /local m f c r] [
        f: n + 1
        m: f * f
        c: 0
        r: make block! n
        loop n [append r (f * random m) + c: c + 1]
    ]

This creates a block of (weird-looking ;-) numbers like this:

    >> foo: random-iota 10
    == [859 794 850 1115 1193 963 689 536 1219 714]

But there's method in my wierdness!  Those number actually have
the property that their remainders (mod N + 1) are the set of
numbers 1 thru N, and their "high-order" bits are random!  That
means that we can sort them and use their remainders to get a
shuffled set of indexes, as follows:

    >> sort foo
    == [536 689 714 794 850 859 963 1115 1193 1219]
    >> foreach n foo [prin [n // 11 " "]] print ""
    8  7  10  2  3  1  6  4  5  9

Based on that, we can create a shuffling function that generates
a block of random indexes and then builds a new block by copying
the indexed elements from the original block.

    shuffled: func [b [block!] /local m n i r] [
        r: make block! n: length? b
        i: sort random-iota n
        m: n + 1
        foreach n i [append r pick b n // m]
    ]

which behaves as follows:

    >> test-data: iota 20
    == [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
    >> new-data: shuffled test-data
    == [6 10 9 12 20 19 15 13 7 11 2 5 18 8 16 14 3 1 17 4]

This version sorts a block of weird-looking integers which can
then have unique index values remaindered out to build the
result from the input block.

I believe it to be a significant speed improvement (although I
haven't had the time to benchmark it ;-) because:

1)  It sorts a block of INTEGER! values.  I suspect this is the
    fastest thing we can sort (but I'm sure Holger will let us
know if I'm wrong on this detail).

2)  It eliminates the need for the "call-back" to a user-written
    comparison function.

3)  It will call RANDOM (within RANDOM-IOTA) exactly once for
    each element in the original block.

Hope this helps!

-jn-
  ___     ___                                                   ___
\/ 2  + \/ 2  =  4  (for sufficiently large approximations of \/ 2 )

                                     joel'dot'neely'at'fedex'dot'com