Using java.util.Collections To Shuffle A ColdFusion Query Column Corrupts Column Values

Posted November 8, 2011 at 10:08 AM by Ben Nadel

Tags: ColdFusion

The other day, I wanted to see if I could use the java.util.Collections class to help shuffle the values in a ColdFusion query. As I've blogged about before, the Collections class presents a shuffle() method which can be used to randomly sort Java collections (ie. ColdFusion arrays). And, since ColdFusion query objects represent columns as collections, I wanted to see if the Collections class could shuffle the row values in a ColdFusion query column. Unfortunately, using shuffle() on a query column appears to corrupt the row values with each pass.

To demonstrate this corruption, I'm going to build a ColdFusion query object with a single column that has numeric values. Then, I'm going to loop 5 times, passing the ColdFusion query column to the shuffle() method with each iteration:

  • <!--- Create a query that we will sort. --->
  • <cfset values = queryNew( "" ) />
  •  
  • <!--- Add the ID column. --->
  • <cfset queryAddColumn(
  • values,
  • "id",
  • "cf_sql_integer",
  • listToArray( "1,2,3,4,5" )
  • ) />
  •  
  •  
  • <!--- Loop a few times to shuffle the column. --->
  • <cfloop
  • index="i"
  • from="1"
  • to="5"
  • step="1">
  •  
  •  
  • <!---
  • Shuffle the column.
  •  
  • NOTE: I am using BRACKET notation so as to refer to the
  • column as a collection (rather than just the first value
  • in the column).
  • --->
  • <cfset createObject( "java", "java.util.Collections" ).shuffle(
  • values[ "id" ]
  • ) />
  •  
  • <!--- Output the shuffled query column. --->
  • <cfdump
  • var="#values#"
  • label="Shuffle (#i#)"
  • />
  •  
  • <br />
  •  
  •  
  • </cfloop>

Notice that when we pass the ColdFusion query column to the shuffle() method, we have to use the bracket notation:

  • values[ "id" ]

This allows us to refer to the column as a collection, not just the first row value (as would be denoted using the dot-notation).

When we run the above code, we get the following page output:


 
 
 

 
 Using java.util.Collections to shuffle() a ColdFusion query column slowly corrupts the column data. 
 
 
 

Notice that with each pass to the shuffle() method, the query column becomes more and more corrupt. But, it does so in a very specific way: the first row value is randomly duplicated within the column. And, while it might not be obvious from the given output, the first row value never changes. It is always "1". No matter how many times I refresh the demo, the first row is always 1. I suspect this is why the duplicate value is always "1" as well.

The java.util.Collections class is awesome. And, the shuffle() method works wonders on normal ColdFusion arrays (which are Java collections behind the scenes). But, it looks like something goes consistently wrong when trying to use a ColdFusion query column as a Java collection. What a strange bug!




Reader Comments

Nov 8, 2011 at 11:33 AM // reply »
270 Comments

@Ben,

Randomization is a sort, of sorts. And database vendors spend millions of dollars every year on optimizing sorts. So why not let the database do it?

Make one of the columns of a select a randomization formula, preferably using a built-in function seeded by time of day, and sort on that column.

I do that for a contractor search engine I created for the Federal Government, so that the search doesn't show favoritism to contractors who got into the database early-on.

As long as the random seed is preserved, you can even page your results. For example, by basing all randomizations on the original start date/time (the one where the first page of results got displayed), you don't get the same row showing up on two different pages (which implies other rows being missed).


Nov 8, 2011 at 11:53 AM // reply »
11,238 Comments

@WebManWalking,

If I had to randomly sort a query with data that was accessible from the database, then I'd probably use the database as much as possible (as you are saying). But, I just wanted to see how or IF this would work. I knew you could use ColdFusion array functions on query columns; but, I didn't know if this would hold true when interacting with the Java layer explicitly.

That said, I can think of reasons why you might need to alter a query after it has come back from the database. It would be cool if this was an available option.

@All,

I didn't link to any of these in the "related" blog posts, but since Steve brought up random query stuff, I figured I throw these out there:

http://www.bennadel.com/blog/1473-Seeding-SQL-RAND-Method-With-NEWID-For-Per-Row-Random-Values-Thanks-Joshua-Cyr-.htm

http://www.bennadel.com/blog/310-Ask-Ben-Getting-A-Random-Date-From-A-Date-Range-In-SQL.htm

To this day, I don't truly understand how binary data values get cast in SQL -- that's beyond my mental model.


Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
May 21, 2013 at 7:46 PM
Using Plupload For Drag & Drop File Uploads In ColdFusion
No luck. At least I have uncovered the cause, URLScan 3.1. Here is what I see in the IIS log when a file is over 30mb. 2013-05-21 23:29:05 10.105.45.128 GET /plupload/assets/jquery/jquery-1.8. ... read »
May 21, 2013 at 6:12 PM
Using Plupload For Drag & Drop File Uploads In ColdFusion
Ben, I did not see you after Pete Freitag's Lockdown session at cfObjective but he said that IIS sets file size limits at 30MB by default which just happened to be the threshold for file size when ... read »
May 21, 2013 at 11:51 AM
Ask Ben: Parsing Very Large XML Documents In ColdFusion
Looking at my first ever XML document that I have to parse and put into MS SQL 2000 with CF8. I get it to list the desired Field name, many times over, and have a long list of this field name displa ... read »
May 21, 2013 at 9:25 AM
Turning Off and On Identity Column in SQL Server
you are awesome..i am lucky to get this blog between such a garbage one....Thanks, Prashant ... read »
May 20, 2013 at 4:38 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Dana, Your confusion is well founded, since this is a very confusing features. In fact, it ONLY works if you use array notation. Meaning, that this: arrayToList( query[ "columnName" ] ) ... read »
May 20, 2013 at 4:34 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
I was thinking chicken and the egg, I wouldn't have expected it to work in the valuelist going in I guess. Maybe I just need a beer, long day :) ... read »
May 20, 2013 at 4:29 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Dana, That's if you're trying to reference a specific row. In this case, we're trying to reference the entire query column as one cohesive value. So, you are correct that if you wanted to output a ... read »
May 20, 2013 at 4:24 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
I thought when you used array notation to reference queries you always had to have the row or it would throw a similar error as well? ... read »
InVision App - Prototyping Made Beautiful With Prototyping Tools