Ben Nadel

ColdFusion RandRange() vs. Java Collections Shuffle()

By Ben Nadel on
Tags: ColdFusion

Sometimes, I get the feeling that ColdFusion's RandRange() function just isn't working properly. This, of course, is not based on anything scientific. In fact, it's mostly based on the fact that manually refreshing a page doesn't always seem to render a random-looking value (like I said, not scientific). So, I wanted to compare it to Java's Collections.shuffle() method to see if the results of one were very different from the results of the other.

To test this, I am creating an array of values generated by RandRange() and an array of values generated by Shuffle(). Then I am averaging both arrays as well as counting values to see if they both come out to be centered with an even distribution of values:

    <!---
        Create an array to hold the numbers selected via
        the ColdFusion method RandRange().
    --->
    <cfset arrRandRange = ArrayNew( 1 ) />

    <!---
        Create an array to hold the numbers selected from
        the first index of a randomly shuffled array (shuffled
        via the java.util.Collections library).
    --->
    <cfset arrShuffle = ArrayNew( 1 ) />


    <!--- Set the number of iterations we are going to do. --->
    <cfset intCount = 1000 />


    <!---
        Loop over the RandRange() function a bunch of
        times to populate the array.
    --->
    <cfloop
        index="intI"
        from="1"
        to="#intCount#"
        step="1">

        <!---
            Use RandRange() to pick a random number
            between 1 and 3 (inclusive).
        --->
        <cfset ArrayAppend(
            arrRandRange,
            RandRange( 1, 3 )
            ) />

    </cfloop>


    <!---
        Now, let's create an array that we are going to
        shuffle before each random selection.
    --->
    <cfset arrForShuffle = ListToArray( "1,2,3" ) />

    <!--- Create a collection utility object. --->
    <cfset objCollections = CreateObject(
        "java",
        "java.util.Collections"
        ) />

    <!---
        Loop over the shuffle function a bunch of
        times to populate the array. Before we select
        the number, we are going to shuffle the array
        from which we are selecting.
    --->
    <cfloop
        index="intI"
        from="1"
        to="#intCount#"
        step="1">

        <!--- Shuffle the selection array (in place). --->
        <cfset objCollections.shuffle(
            arrForShuffle
            ) />

        <!---
            When populating the value array, just select
            the first value in the shuffled array.
        --->
        <cfset ArrayAppend(
            arrShuffle,
            arrForShuffle[ 1 ]
            ) />

    </cfloop>


    <!---
        Now that we have populated both arrays, let's get
        the average of all the values. Theoretically, they
        should both be just about 2 since the randomness
        should be somewhat evenly distributed.
    --->
    <cfoutput>
        RandRange: #ArrayAvg( arrRandRange )#
        Shuffle: #ArrayAvg( arrShuffle )#
    </cfoutput>

As you can see, I am adding RandRange() values to one array. Then, I am shuffling another array and selecting the first value from it to populate the Shuffle() array. Both loops randomly select one of the values 1, 2, or 3. When I run the above code, the two averages always come out roughly the same:

RandRange: 2.016
Shuffle: 1.987

RandRange: 1.992
Shuffle: 1.992

RandRange: 1.987
Shuffle: 1.986

RandRange: 1.957
Shuffle: 1.997
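
A rough way to judge "roughly the same" is to ask how far the average of 1000 uniform draws from 1, 2, and 3 would be expected to wander from 2. The following is a minimal sketch in Java (chosen only because the post already leans on a Java library; the class and method names are illustrative, not part of the original test). It assumes each draw is independent and uniform, which is exactly the property being checked.

```java
// Sketch: expected spread of the average of n uniform draws from {1, 2, 3}.
// SampleMeanCheck, standardError, and zScore are hypothetical names.
class SampleMeanCheck {

    // Standard error of the mean for n independent uniform draws.
    // A single draw has variance ((1-2)^2 + (2-2)^2 + (3-2)^2) / 3 = 2/3.
    static double standardError(int n) {
        double variance = 2.0 / 3.0;
        return Math.sqrt(variance / n);
    }

    // How many standard errors an observed average sits away from 2.
    static double zScore(double observedAverage, int n) {
        return (observedAverage - 2.0) / standardError(n);
    }

    public static void main(String[] args) {
        int n = 1000;

        // The eight averages reported in the four runs above.
        double[] reported = {2.016, 1.987, 1.992, 1.992, 1.987, 1.986, 1.957, 1.997};

        System.out.printf("standard error for n=%d: %.4f%n", n, standardError(n));
        for (double avg : reported) {
            System.out.printf("average %.3f -> z = %+.2f%n", avg, zScore(avg, n));
        }
    }
}
```

With 1000 draws the standard error is about 0.026, and every average reported above sits within two standard errors of 2 (the furthest, 1.957, is about 1.7 standard errors away).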

But, what about the value distribution? An array of 1000 2s would produce the same average as 500 1s and 500 3s. To make sure the counts were evenly distributed, I created an ArrayCount() method that takes an array and a value and returns the number of times that (simple) value can be found in the array:

    <cffunction
        name="ArrayCount"
        access="public"
        returntype="numeric"
        output="false"
        hint="Returns the number of times the given value can be found in the given array.">

        <!--- Define arguments. --->
        <cfargument
            name="Array"
            type="array"
            required="true"
            hint="The array we will be searching."
            />

        <cfargument
            name="Value"
            type="string"
            required="true"
            hint="The value we will be counting."
            />

        <!--- Define the local scope. --->
        <cfset var LOCAL = StructNew() />

        <!--- Start the default value count. --->
        <cfset LOCAL.Count = 0 />


        <!--- Loop over the array looking for the value. --->
        <cfloop
            index="LOCAL.Index"
            from="1"
            to="#ArrayLen( ARGUMENTS.Array )#"
            step="1">

            <!---
                Check to see if the current index contains
                a matching value for what we are searching.
            --->
            <cfif (ARGUMENTS.Array[ LOCAL.Index ] EQ ARGUMENTS.Value )>

                <!--- We found a matching value. --->
                <cfset LOCAL.Count = (LOCAL.Count + 1) />

            </cfif>

        </cfloop>


        <!--- Return the value count. --->
        <cfreturn LOCAL.Count />
    </cffunction>


    <!--- Output the number of times each value was selected. --->
    <cfoutput>
        RandRange
        ArrayCount( 1 ): #ArrayCount( arrRandRange, 1 )#
        ArrayCount( 2 ): #ArrayCount( arrRandRange, 2 )#
        ArrayCount( 3 ): #ArrayCount( arrRandRange, 3 )#

        Shuffle
        ArrayCount( 1 ): #ArrayCount( arrShuffle, 1 )#
        ArrayCount( 2 ): #ArrayCount( arrShuffle, 2 )#
        ArrayCount( 3 ): #ArrayCount( arrShuffle, 3 )#
    </cfoutput>

Again, running the above code shows a very even distribution of values:

RandRange
ArrayCount( 1 ): 316
ArrayCount( 2 ): 317
ArrayCount( 3 ): 367

Shuffle
ArrayCount( 1 ): 316
ArrayCount( 2 ): 343
ArrayCount( 3 ): 341

RandRange
ArrayCount( 1 ): 360
ArrayCount( 2 ): 321
ArrayCount( 3 ): 319

Shuffle
ArrayCount( 1 ): 320
ArrayCount( 2 ): 356
ArrayCount( 3 ): 324

RandRange
ArrayCount( 1 ): 323
ArrayCount( 2 ): 328
ArrayCount( 3 ): 349

Shuffle
ArrayCount( 1 ): 335
ArrayCount( 2 ): 327
ArrayCount( 3 ): 338
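
One way to put a number on how even these counts are is Pearson's chi-square statistic, which sums (observed - expected)² / expected over the three values. With three categories there are 2 degrees of freedom, and the usual 5% critical value is 5.991. Below is a minimal sketch in Java (again, only because the post already uses a Java library; the class and method names are illustrative), fed the six sets of counts reported above.

```java
import java.util.Arrays;

// Sketch: chi-square goodness-of-fit against an even split.
// UniformityCheck and chiSquare are hypothetical names.
class UniformityCheck {

    // Standard 5% critical value of the chi-square distribution
    // with 2 degrees of freedom (three categories minus one).
    static final double CRITICAL_5_PERCENT_DF2 = 5.991;

    // Sums (observed - expected)^2 / expected, where expected is the
    // total count divided evenly among the categories.
    static double chiSquare(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        double expected = (double) total / counts.length;
        double stat = 0.0;
        for (int c : counts) {
            double diff = c - expected;
            stat += diff * diff / expected;
        }
        return stat;
    }

    public static void main(String[] args) {
        // Counts of 1s, 2s, and 3s copied from the output above.
        int[][] runs = {
            {316, 317, 367}, // RandRange, run 1
            {316, 343, 341}, // Shuffle, run 1
            {360, 321, 319}, // RandRange, run 2
            {320, 356, 324}, // Shuffle, run 2
            {323, 328, 349}, // RandRange, run 3
            {335, 327, 338}  // Shuffle, run 3
        };
        for (int[] counts : runs) {
            double stat = chiSquare(counts);
            System.out.printf("%s -> chi-square = %.3f (%s)%n",
                Arrays.toString(counts), stat,
                stat < CRITICAL_5_PERCENT_DF2 ? "consistent with even" : "suspicious");
        }
    }
}
```

Each of the six statistics comes out below 5.991 (the largest, for the first RandRange run, is about 5.10), so none of these runs gives grounds to reject an even distribution.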

For everything above, I was testing with 1000 iterations. I thought that maybe a set that large just makes it easy for the values to even out. But, when I changed from 1000 iterations down to 10 iterations, I found much the same average and value distribution. So, I guess the lesson learned here is that RandRange() is selecting numbers randomly and I am just paranoid :)



Reader Comments

I'm more of a visual person, so I'd prefer to see the data as ordered pairs on a chart to test randomness.

<cfset statsAry = arrayNew(1) />
<cfloop from="1" to="30" index="i">
<cfset arrayAppend(statsAry,randRange(1,10) & ',' & randRange(1,10)) />
</cfloop>
<cfset arraySort(statsAry,'numeric') />
<cfchart format="png" name="statsGraph">
<cfchartseries type="scatter">
<cfloop from="1" to="#arrayLen(statsAry)#" index="i">
<cfchartdata item="#listFirst(statsAry[i])#" value="#listLast(statsAry[i])#">
</cfloop>
</cfchartseries>
</cfchart>
<cfheader name="Content-Type" value="image/png">
<cfheader name="Content-Disposition" value="inline; filename=statsGraph.png">
<cfcontent variable="#statsGraph#">

Hrmm... there went my tabs, oh well. Anyway, small bug in that code:

This:
<cfchartseries type="scatter">
<cfloop from="1" to="#arrayLen(statsAry)#" index="i">
<cfchartdata item="#listFirst(statsAry[i])#" value="#listLast(statsAry[i])#">
</cfloop>
</cfchartseries>

Should be:
<cfloop from="1" to="#arrayLen(statsAry)#" index="i">
<cfchartseries type="scatter">
<cfchartdata item="#listFirst(statsAry[i])#" value="#listLast(statsAry[i])#">
</cfchartseries>
</cfloop>

Either I'm retarded or I must have not had enough coffee this morning (I'd prefer to think the latter).

For some reason, sorting the array as 'numeric' works for the ordered pairs when the randRange values are <10, but not when they are larger. I'm not sure why that's the case, but when I change my values to >10, it (sometimes) throws an error on random elements (go figure) saying the element was a string. Obviously the array was completely filled with strings before, too, so why was there no error then? Any ideas?

Fixed code:
<cfset statsAry = arrayNew(1) />
<cfloop from="1" to="30" index="i">
<cfset arrayAppend(statsAry,randRange(1,100)) />
</cfloop>
<cfset arraySort(statsAry,'numeric') />
<cfloop from="1" to="30" index="i">
<cfset statsAry[i] = statsAry[i] & ',' & randRange(1,100) />
</cfloop>
<cfchart format="png" name="statsGraph">
<cfloop from="1" to="#arrayLen(statsAry)#" index="i">
<cfchartseries type="scatter">
<cfchartdata item="#listFirst(statsAry[i])#" value="#listLast(statsAry[i])#">
</cfchartseries>
</cfloop>
</cfchart>
<cfheader name="Content-Type" value="image/png">
<cfheader name="Content-Disposition" value="inline; filename=statsGraph.png">
<cfcontent variable="#statsGraph#">

Ben,

I have often felt the same way. So, we are both paranoid!

Running your tests in Scribble pad in CFE was good salve for my paranoia though.

Thanks,

Chris P

@Chris,

I think part of the problem is that there is really no such thing as a truly random number. As such, the number needs to be generated based on some sort of algorithm and seed value (I think). I had thought that perhaps it was using the internal clock to help generate those values, which might explain the refresh-page technique not showing more seemingly random values... but I don't really understand how random number generation works, so I don't know how any of it is tied together.

Another interesting experiment would be to see how time affects a random number. Maybe a pattern could be found based on milliseconds of the internal clock.

Of course, I really have no idea what I am talking about :)

@Steve,

Interesting. I don't think I have used this before. I believe that there is something similar in JavaScript that I have used, but never in ColdFusion. I wonder if this affects the outcome of RandRange(), which is the randomization function that I use most often. I can't imagine that RandRange() isn't just sitting on top of the built-in randomization methods.

I know this is old, and you've probably figured this out by now, but for completion's sake (and because I was searching for reasons to use Randomize or not) ...

Yes, Randomize() affects RandRange() as well as Rand().

(all this was tested in CF 8.0.1, Standard Edition)

Calling Randomize with the same seed and the default algorithm will generate the same sequence with RandRange for every call; the sets of random numbers will all be the same.

Calling Randomize with the same seed and the SHA1PRNG algorithm will generate the same set of sequences with RandRange; the sets will be different but the contents of each set will be the same.

Example:
Call RandRange(1, 100) 10 times
10 random numbers, call it set A
Call Randomize(123)
Call RandRange(1, 100) 10 times
10 "random" numbers, call it set B
Call Randomize(123)
Call RandRange(1, 100) 10 times
10 "random" numbers, set B again
Call RandRange(1, 100) 10 times
10 "random" numbers, call it set C

If you refresh this page, set A will change; sets B and C will not. Sets A, B, and C are all different.
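
The same-seed, same-sequence pattern described here is what a seeded pseudo-random generator generally does. As an analogy only, here is a minimal Java sketch using java.util.Random that follows the same steps. It is not a claim about which generator ColdFusion uses under the hood, and the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Random;

// Sketch: reseeding a pseudo-random generator replays its sequence.
// SeedDemo and draw are hypothetical names.
class SeedDemo {

    // Draws `count` integers in the inclusive range [low, high].
    static int[] draw(Random rng, int count, int low, int high) {
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = low + rng.nextInt(high - low + 1);
        }
        return values;
    }

    public static void main(String[] args) {
        // No explicit seed: this set differs from run to run (like set A).
        Random rng = new Random();
        int[] setA = draw(rng, 10, 1, 100);

        // Seed with 123, then draw (like set B).
        rng.setSeed(123);
        int[] setB = draw(rng, 10, 1, 100);

        // Reseed with 123: the generator replays the same ten numbers.
        rng.setSeed(123);
        int[] setBAgain = draw(rng, 10, 1, 100);

        // Keep drawing without reseeding: the sequence continues (like set C).
        int[] setC = draw(rng, 10, 1, 100);

        System.out.println("A:       " + Arrays.toString(setA));
        System.out.println("B:       " + Arrays.toString(setB));
        System.out.println("B again: " + Arrays.toString(setBAgain));
        System.out.println("C:       " + Arrays.toString(setC));
        System.out.println("B repeats: " + Arrays.equals(setB, setBAgain));
    }
}
```

Run twice, this sketch prints a different set A each time, while both B sets and set C stay fixed, because they are determined entirely by the seed.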

Call RandRange(1, 100) 10 times
10 random numbers, call it set W
Call Randomize(123, "SHA1PRNG")
Call RandRange(1, 100, "SHA1PRNG") 10 times
10 "random" numbers, call it set X
Call Randomize(123, "SHA1PRNG")
Call RandRange(1, 100, "SHA1PRNG") 10 times
10 "random" numbers, call it set Y
Call RandRange(1, 100, "SHA1PRNG") 10 times
10 "random" numbers, call it set Z

If you refresh this page, set W will change; sets X, Y, and Z will not. Sets W, X, Y, and Z are all different. (Note that seeding the RNG with the same seed and the SHA1PRNG algorithm produces a different "starting point" for random numbers for each seeding.)

Of course this does not change the fact that these are still pseudo-random numbers (they're still sequences) ... using the SHA1PRNG algorithm just generates the numbers differently.

But ... why use Randomize if you need to provide a seed for it and the sets of numbers prior to using Randomize change each time?

That I'm not sure about. When I tested this, even when I created separate pages and took out the SHA1PRNG calls, a new page with just RandRange on it produced different sets of numbers, even if the old page still had sets A, B, B, C each time.

@Dave,

I think one of the ideas behind Randomize() is that you can use a user-entered value so that the user can help create a more random outcome.

Just had a customer wonder why they were seeing one student's submission when the submission was sent from another student. We have a function that creates a random 7-character alphanumeric string. It uses 4 calls to RandRange() doing different things... well, two students creating submissions at about the same time (2 seconds apart) created the same string, and the quick search of the db for past strings didn't find a match because, I guess, the two were too close together to be found.

Hard to explain that to a customer!

Thanks guys.

@Quigtj,

Yeah, the time-based randomness of RandRange() can definitely be frustrating. I have found it especially irritating when generating large amounts of "random" test data.
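
If the duplicates really do come from clock-dependent seeding (an assumption; nothing in this thread confirms how RandRange() is seeded), one option is to draw the characters from a generator that seeds itself from a platform entropy source. Here is a minimal sketch in Java using java.security.SecureRandom; the class name, method name, and alphabet are illustrative. It only makes collisions unlikely, so a unique constraint on the database column is still needed to guarantee distinct strings.

```java
import java.security.SecureRandom;

// Sketch: 7-character alphanumeric token drawn from SecureRandom.
// TokenSketch, ALPHABET, and randomToken are hypothetical names.
class TokenSketch {

    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    // SecureRandom seeds itself; no seed is supplied by this code.
    static final SecureRandom RNG = new SecureRandom();

    // Builds a token by picking `length` characters uniformly from ALPHABET.
    static String randomToken(int length) {
        StringBuilder token = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            token.append(ALPHABET.charAt(RNG.nextInt(ALPHABET.length())));
        }
        return token.toString();
    }

    public static void main(String[] args) {
        System.out.println(randomToken(7));
    }
}
```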
