ColdFusion CFQuery MaxRows Not A Bad Compromise When TOP Not Available

Posted January 8, 2007 at 7:53 PM

Tags: ColdFusion, SQL

Not so long ago, I was introduced to the MaxRows attribute of the ColdFusion CFQuery tag (thanks Joe Rinehart). It's one of those things that you can't believe you didn't know about, but sure enough, there it is. If anyone else doesn't know what it does, it limits the number of records that can be returned by a given query. So, for instance, the following query:

 Launch code in new window » Download code as text file »

  • <!--- Limit the record set using MaxRows. --->
  • <cfquery name="qMovie" datasource="..." maxrows="30">
  • SELECT
  • *
  • FROM
  • movie
  • ORDER BY
  • name ASC
  • </cfquery>

... will only return the TOP 30 rows to the ColdFusion memory space since it's MaxRows attribute is set to "30." This is similar to the way the TOP directive works in SQL server:

 Launch code in new window » Download code as text file »

  • <!--- Limit the record set using TOP. --->
  • <cfquery name="qMovie" datasource="...">
  • SELECT TOP 30
  • *
  • FROM
  • movie
  • ORDER BY
  • name ASC
  • </cfquery>

Both the above queries accomplish the same thing by limiting the max number of returned records to 30. However, one is a ColdFusion directive (MaxRows) and one is a SQL directive (TOP). When compared though, MaxRows does pretty well. When I "select *" from a database that has 50,000+ records, using the TOP directive executed in about 30-40 ms. MaxRows returned the same record set in around 125-140 ms. This is about 3-4 times slower... but when you compare that to the query that ran without either the TOP or the MaxRows directive (on average 400-450 ms), it's a pretty darn good compromise when the SQL TOP directive is not available.

I don't know why MaxRows executes faster than the query that has neither TOP nor MaxRows. I assume that ColdFusion is basically cutting off the database stream once the "MaxRows" number of records have been returned (but this is just theory). I wonder how this affects the SQL Server. I wonder if, from the SQL standpoint, it is still returning all those rows? Does the SQL server still have to amass the massive record set?

Efficiency aside, MaxRows is sometimes the only option available, such as when performing a ColdFusion query of queries (which is where Joe jumped in and pointed out my ignorances).

Download Code Snippet ZIP File

Comments (6)  |  Post Comment  |  Ask Ben  |  Permalink  |  Other Searches  |  Print Page



Adobe ColdFusion 8.0.1 Update - Helping Programmers To Be Signifanctly Less Girlie - Download ColdFusion 8 Update 8.0.1 Now.

Reader Comments

I know that OLEDB, and I think also ODBC, can tell the SQL Server that it only wants the first n rows, and I know that the SQL Server will optimize for that. (I think I remember reading something about SQL Server even being smart enough to throw away unneeded data mid-query once it has fulfilled it's minimum result set.)

And, of course, always remember that Postgres and mySQL use the LIMIT n syntax, DB2 uses the FETCH FIRST n ROWS ONLY syntax, and Oracle uses ROWNUM in the WHERE clause.

PITA, eh?

Posted by Rick O on Jan 8, 2007 at 8:56 PM


What? They couldn't come up with a way to make "FETCH FIRST n ROWS ONLY" even longer? Maybe throw a "please" at the end :)

Rick, thanks for the insight. Good to know that optimization does occur and that it occurs in both the ColdFusion server and the SQL server (potentially).

Posted by Ben Nadel on Jan 9, 2007 at 7:32 AM


Here is how I would do it in MySQL

The First 10
Select * from table order by ID Limit 10

The Last 10

Select * from table order by ID Desc Limit 10

Posted by Randy Johnson on Jan 9, 2007 at 7:53 AM


Actually, it's even worse than that. For most queries via CF, it's a good idea to tell the DB2 engine that the query you are about to run is not going to be updated. It will avoid some locking problems and speed up your query by around 5%. In SQL server, it's pretty straightforward:

SELECT TOP 10 foo, bar
FROM mytable (NOLOCK)

How do you do it in DB2?

SELECT foo, bar
FROM mylib.mytable
FETCH FIRST 10 ROWS ONLY
FOR FETCH ONLY

Someone at IBM really, really liked the word "ONLY".

Posted by Rick O on Jan 9, 2007 at 11:08 AM


@Rick,

That is really interesting. I have never seen that before. Is that something you should do only if you are not expecting simultaneous hits to the DB? Or is that something you can do any time you are just reading?

Posted by Ben Nadel on Jan 9, 2007 at 12:07 PM


For NOLOCK, it allows your query to perform dirty reads, which may or may not be applicable to your situation. For data that changes very infrequently but is queried often (especially as part of a large join), dirty reads may improve performance by reducing table contention.

But!

Dirty reads can also read data that never actually gets committed! If a transaction fails and is rolled back, a dirty read might get the bogus data. I only use it for lookup tables that almost never change.

Done completely in DB2:

SELECT foo, bar
FROM mylib.mytable
FETCH FIRST 10 ROWS ONLY
OPTIMIZE FOR 10 ROWS
FOR FETCH ONLY
WITH UR

That "optimize" part kindof acts like the server-size equivalent of CFQuery's BLOCKFACTOR attribute, while the "UR" part represents "uncommitted read", the dirty part.

DB2 is nothing if not verbose. It may seem like over-optimization, but it's good to know how to do just in case.

Posted by Rick O on Jan 9, 2007 at 1:09 PM


Post Comment  |  Ask Ben


Home   |   Web Log   |   ColdFusion   |   Projects   |   Resume   |   Job Form   |   Search   |   Contact
Epicenter Consulting - Custom Software Solutions for Business Evolution HostMySite.com - The Leader In ColdFusion Hosting