I was thinking about how to answer a new ColdFusion-based "Ask Ben" question about a rating system when I started thinking about numeric averages. All my life, when creating an average, I have followed the simple formula of dividing the sum of a collection by the number of its entries:
Average = Sum / N
This, of course, requires you to have both the sum and the count of a collection of values. But what if we wanted to keep track of a set's average as the set of values changed over time (such as the ratings in a rating system)? Unless I was storing user-specific data with each individual rating, I wouldn't want to have to store each rating individually. So this got me thinking about weighted averages. I wondered: would it be possible to update an average simply by averaging a new value into it in a weighted manner? Meaning, given an average value based on N numbers, could I simply average in a new number by giving the current average a weight of N and the new number a weight of 1?
I am sure that anyone with a decent background in math is reading this and saying, "D'uh," but to me, the answer was not immediately obvious. As such, I ran some tests to see if this worked:
<!---
	Create an array in which to hold our original set of random
	numbers (for which we will be finding an average).
--->
<cfset randomNumbers = [] />

<!---
	Now, let's create some random numbers to store in the array.
	We're going to keep the set relatively small since a small
	set will be influenced more by each number.
--->
<cfloop index="index" from="1" to="10" step="1">

	<!--- Add a new, random number to the collection. --->
	<cfset arrayAppend( randomNumbers, randRange( 1, 10 ) ) />

</cfloop>

<!---
	At this point, we have our number collection. Let's figure
	out the average of this collection, before we add anything
	new. (NOTE: getting count for use later).
--->

<!--- Get the number of random numbers we created. --->
<cfset randomCount = arrayLen( randomNumbers ) />

<!--- Get the current average of the collection. --->
<cfset baseAverage = arrayAvg( randomNumbers ) />

<!---
	Now, let's create a new random number that we want to use
	to update our base average.
--->
<cfset newNumber = randRange( 1, 10 ) />

<!---
	At this point, we have two options:

	1. We can add the new number to the existing collection and
	then take a new average.

	2. We can take the existing average of the collection and
	then combine it with the new number in a *weighted* fashion
	such that we only need the average and the count.
--->

<cfoutput>

	<!--- Method 1: Add number to the collection. --->
	<cfset arrayAppend( randomNumbers, newNumber ) />

	Method 1 Average: #arrayAvg( randomNumbers )#<br />

	<!--- Method 2: Create weighted average based on count. --->
	<cfset newAverage = ( ((baseAverage * randomCount) + newNumber) / (randomCount + 1) ) />

	Method 2 Average: #newAverage#<br />

</cfoutput>
As you can see in the code, I am creating a collection of 10 random numbers and then an 11th random number. To figure out how the 11th number changes the average of the first 10, I try two different methods:
1. Simply adding the 11th number to the existing collection and dividing the new sum by 11.
2. Multiplying the average of the first 10 by 10 and then adding the 11th (creating a weighted sum) and then dividing by 11.
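Written out as formulas, the two methods look something like the following. The symbols are my own shorthand and don't appear in the code: x_1 through x_10 are the first ten random numbers, A_10 is their average (baseAverage), and x_11 is the new number (newNumber).

```latex
\begin{align*}
\text{Method 1:}\quad A_{11} &= \frac{(x_1 + x_2 + \dots + x_{10}) + x_{11}}{11} \\[4pt]
\text{Method 2:}\quad A_{11} &= \frac{(A_{10} \cdot 10) + x_{11}}{11},
\qquad \text{where } A_{10} = \frac{x_1 + x_2 + \dots + x_{10}}{10}
\end{align*}
```

Since A_10 times 10 is just the sum of the first ten numbers, the two numerators should be identical on paper.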
I ran this code a few times to make sure nothing happened by coincidence:
Method 1 Average: 6.27272727273
Method 2 Average: 6.27272727273
Method 1 Average: 6
Method 2 Average: 6
Method 1 Average: 5.27272727273
Method 2 Average: 5.27272727273
Method 1 Average: 3.81818181818
Method 2 Average: 3.81818181818
Method 1 Average: 6.45454545455
Method 2 Average: 6.45454545455
As you can see, after several runs, both average-augmentation methods come up with the same value each time. This is really awesome (and again, I'm sure obvious to the more Mathletic among you)! What this means is that if we need to grow an average over time, we don't actually need to store the individual values - we only need to store the set size and the set average at any given time. This should make my new "Ask Ben" answer a bit more straightforward!
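To make the "only store the count and the average" idea a bit more concrete, here is a minimal sketch of how it might look for ratings. The addToAverage() function, the ratingStats struct, and the sample ratings are all hypothetical names and values that I made up for illustration; this is one possible approach under those assumptions, not a finished rating system.

```cfm
<!---
	Hypothetical helper (name and struct shape are assumptions):
	folds one new value into a running average using only the
	stored count and the stored average.
--->
<cffunction name="addToAverage" returntype="struct" output="false">
	<cfargument name="stats" type="struct" required="true" />
	<cfargument name="newValue" type="numeric" required="true" />

	<cfset var result = structNew() />

	<!--- The set grows by one value. --->
	<cfset result.count = ( arguments.stats.count + 1 ) />

	<!---
		Weight the old average by the old count, add the new
		value, and then divide by the new count.
	--->
	<cfset result.average = ( ((arguments.stats.average * arguments.stats.count) + arguments.newValue) / result.count ) />

	<cfreturn result />
</cffunction>

<!--- Start with an empty set: no ratings yet. --->
<cfset ratingStats = { count = 0, average = 0 } />

<!--- Fold in a few made-up ratings, one at a time. --->
<cfloop index="rating" list="4,5,3">
	<cfset ratingStats = addToAverage( ratingStats, rating ) />
</cfloop>

<cfoutput>
	Count: #ratingStats.count#<br />
	Average: #ratingStats.average#<br />
</cfoutput>
```

With the sample ratings of 4, 5, and 3, this should end up with a count of 3 and an average of 4, which matches (4 + 5 + 3) / 3. Starting from a count of 0 works because the old average then carries a weight of 0, so the first rating simply becomes the average.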