Randomly Selected vs. Evenly Distributed

Posted July 19, 2007 at 8:37 AM by Ben Nadel

In the past, I have complained about random number generation in ColdFusion. Others have also complained a bit about the randomness (or lack thereof) of my Use It Or Lose It emails. Outside of programming, I find random selection to be frustrating as well; my iPod seems to replay some songs a lot and NEVER chooses other ones. My WimAmp does the same thing. But this got me thinking; what's really bothering me? Is it random selection? Or is the lack of even distribution? Are the two independent, or can you not even consider one without the other?

Take, for example, the answer I posted for the ColdFusion Weekly Podcast on selecting random elements from an array. In that solution, the value that we were returning was randomly selected, but it was selected in such a way that no indexes were repeated until the whole array was exhausted. Is this really random selection? Yes, in terms of how each index was selected (random number generation), but really no, because each selection was influenced by the previous N selections. But is it an evenly distributed selection? Yes, each value in the array is selected once without repetition, therefore distributing our selection across the entire array.

In truly random number generation, each randomly generated number should be a completely independent entity, having nothing to do with any other number generating actions. Therefore, if we were randomly selecting values between 1 and 10, it is possible (but unlikely) to randomly select 5 ten thousand times in a row. But, when it comes to system interaction, just because something like this is random, it does not mean that it is a satisfactory outcome. I think this is because when people "say" random, what they "mean" is random AND evenly distributed.

The problem with random AND evenly distributed selections is really that they are no longer random and it requires some sort of data persistence. How does one go about doing this in a way that is more pleasing to the user without adding the overhead of complexity?

I have no idea, this is more of a rambling and a plea for suggestions.




Reader Comments

Jul 19, 2007 at 9:43 AM // reply »
42 Comments

If you need a greater randomness you can use an alternate random number algorithm (mx7):

randRange(1,10,'SHA1PRNG')

The docs even state: "SHA1PRNG: generates a number using the Sun Java SHA1PRNG algorithm. This algorithm provides greater randomness than the default algorithm"

http://livedocs.adobe.com/coldfusion/7/htmldocs/00000605.htm


db
Jul 19, 2007 at 9:47 AM // reply »
3 Comments

here's something to get you started -
http://www.coldfusionjedi.com/index.cfm/2006/8/16/ColdFusion-101-Picking-a-random-image-2


Jul 19, 2007 at 9:57 AM // reply »
110 Comments

I think the way I've handled it previously is to keep a running tab of all of the "randomly" selected values. Then run a comparison of the newly selected items, check them against the table, rinse-and-repeat until you reach however many you are looking for. As you get more data in there, it can slow the process down a bit though (as it's running more and more comparisons), but as it's a scheduled task and I'll never see it, I've never worried *too* much about the slowdown.

If you're using SQL, you could see if that returns a more evenly distributed random result
http://www.sqlteam.com/article/using-newid-to-randomly-sort-records


Jul 19, 2007 at 9:58 AM // reply »
42 Comments

While on the topic of randomness. If you're trying to retrieve random items from a database and you're using MS SQL, the NewID() function is a good method of doing it.

If you want all:
SELECT ID
FROM FOO
ORDER BY NEWID()

If you only need (n):
SELECT TOP 1 ID
FROM FOO
ORDER BY NEWID()


Jul 19, 2007 at 10:01 AM // reply »
10,638 Comments

@Dustin,

Thanks, I will check that out.

@DB,

That is a nice tutorial, but there he is storing things in the session. I am doing things that do not involve session and would be spread out over a time period that is longer than a session would last.


Jul 19, 2007 at 10:07 AM // reply »
8 Comments

Tecnhically speaking, there's no such thing as random in a computer. There are only calculations based on values that are difficult for a human to predict.

In fact, there are lines of thought that suggest that randomness is a myth in any scenario, computed or not. i.e. everything is caused by something, so there's nothing that's truly random - only things that appear to be random, because we cannot readily discern the cause.

But you're right in that randomness is not equivalent to even distribution, though as the number or iterations approaches infinity, they begin to resemble each other.

P.S. I'm not really this high-brow. I'm just having a moment.


Jul 20, 2007 at 1:06 AM // reply »
10 Comments

"In fact, there are lines of thought that suggest that randomness is a myth in any scenario, computed or not. i.e. everything is caused by something, so there's nothing that's truly random - only things that appear to be random, because we cannot readily discern the cause."

Maybe, until you look close enough that you are in the size range where quantum physics becomes dominant, where things are truly random to the point of being completely unpredictable no matter how good a measurement is.

http://en.wikipedia.org/wiki/Hardware_random_number_generator has a good summary.


Jul 20, 2007 at 7:03 AM // reply »
10,638 Comments

I should really go back and watch Jurassic Park :) I think Jeff Goldbum's character had a whole thing about Chaos Theory and random events. Plus I like watching movies.


Jul 20, 2007 at 7:13 AM // reply »
10 Comments

Here you go:

http://www.youtube.com/watch?v=dpDckbqhpW8


Jul 20, 2007 at 7:21 AM // reply »
10,638 Comments

Ha ha, that video is a little disturbing :)


Jul 20, 2007 at 12:52 PM // reply »
3 Comments

Computer generated randomness is not random, but rather Pseudorandom Per http://en.wikipedia.org/wiki/Pseudorandom:

"A pseudorandom process is a process that appears random but is not. Pseudorandom sequences typically exhibit statistical randomness while being generated by an entirely deterministic causal process. Such a process is easier to produce than a genuine random one, and has the benefit that it can be used again and again to produce exactly the same numbers, useful for testing and fixing software.

To date there is no known method to produce true randomness, because due to the very nature of randomness, any factor determining the outcome would mean that it is not random at all. The random number generation functions provided in many software packages are pseudorandom."

Quoting wikipedia because its convenient, not because its an authority on the subject.


Jul 20, 2007 at 2:12 PM // reply »
8 Comments

I think there's a whole fascinating discussion to be had on whether or not randomness exists, though that's not the intent of the thread.

More relevant is the notion of testing for randomness, which - by any definition - is impossible. If randomness means unpredictability, than any test that PREDICTS the outcome, even the nature of the outcome - more proves unrandomness than randomness. Going with the analogy of a million monkeys on a million typewriters eventually producing the works of shakespeare. That outcome - the works of shakespeare -would undoubtedly fail any tests for randomness, even though they were "randomly" generated.

Watchout! The gravitaional pull is pulling me in!

I once saw an episode of Numb3ers where the genius character was demonstrating randomness. He asked a group of 10 or so people to take up random positions in the room. Once everyone did, he pointed out that it was not random at all. Everyone was about the same distance from each other (evenly distributed), and said that if their positions were truly random you might have clusters of 2 or three people occupying the same area, or close to it.


Jul 20, 2007 at 2:23 PM // reply »
10,638 Comments

I'm waiting for someone to switch over to talking about the philosophy of free will ;)


Jul 23, 2007 at 8:16 AM // reply »
10,638 Comments

@Dustin,

Thanks for the tip on SHA1PRNG. I just did some testing:

http://www.bennadel.com/index.cfm?dax=blog:852.view

And it seems that SHA1PRNG does produce a more even distribution of evenly timed, randomly generated numbers.



Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
InVision App - Prototyping Made Beautiful With Prototyping Tools Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
Feb 3, 2012 at 10:49 PM
How I Got Node.js Running On A Linux Micro Instance Using Amazon EC2
Wow this was really helpful! Only thing I would add is you need to update your .bash_profile after you edit the secure_path. This is what I did: $ . ~/.bash_profile Otherwise, NPM won't be found. ... read »
Feb 3, 2012 at 10:14 PM
Pushing Base64-Encoded Images Over HTML5 WebSockets With Pusher And ColdFusion
@Ben, Just wanted to let you know that pusher are soon to start limiting sizes on messages. This was the detail that came through in the Feb dispatch: "However, we will soon be limiting the s ... read »
Feb 3, 2012 at 5:05 PM
Regular Expressions Make CSV Parsing In ColdFusion So Much Easier (And Faster)
I tried using your RegEx in my C# program, but it was matching an extra empty-string at the end and so I would end up with an extra field that doesn't exist, so I changed it to this: (^|,)("(?: ... read »
Feb 3, 2012 at 3:47 PM
ColdFusion Supports HTTP Verbs PUT And DELETE (As Well As GET And POST)
Josh Cyr posted this on Twitter just a little bit ago. Thought it was appropriate. http://stackoverflow.com/questions/1619152/how-to-create-rest-urls-without-verbs/1619677#1619677 ... read »
Feb 3, 2012 at 2:28 PM
Changing The Execution Context Of Your Self-Executing Function Blocks In JavaScript
@Michael, You definitely make a good point (and extra points for quoting movies - I love movies). When you use a return() statement to define the object's public API, it does provide a consistent a ... read »
Feb 3, 2012 at 2:04 PM
Changing The Execution Context Of Your Self-Executing Function Blocks In JavaScript
To quote Jurassic Park: "Just because you can doesn't mean you should". I completely, utterly disagree with the thought that this is more readable. Consider the current module pattern: if ... read »
Feb 3, 2012 at 1:10 PM
REST API Design Rulebook By Mark Masse
@Jordan, Yeah, WRML was created by Mark Masse (author of the book). I also found it to be a bit convoluted. I suppose it is intended to allow the Client to be able to programmaticaly respond to cha ... read »
Feb 3, 2012 at 1:08 PM
ColdFusion Supports HTTP Verbs PUT And DELETE (As Well As GET And POST)
@Jason, To be honest, I don't have good answers for that kinds of stuff. And, to the point, that is specifically why I *really* liked the REST API Design Rulebook by Mark Masse - he just cuts throu ... read »