SQL IN Directive Much Faster Than Multiple OR Clauses

Posted July 27, 2006 at 4:21 PM

Tags: ColdFusion, SQL

I use the IN directive a lot in SQL, especially in conjunction with ValueList()'s of IDs. I was working on a query today that was dealing with thousands of IDs and I got curious as to how the IN directive compared to a more labor-intensive grouping or OR clauses. To me, the IN directive is elegant, but if it's slow, I wouldn't mind writing out the ORs.

To test this, I queried the page-hit IDs from a web statistics database and then used that list of IDs to basically re-query the database for the same records:

 Launch code in new window » Download code as text file »

  • <!--- Query for the hit IDs. --->
  • <cfquery name="qIDs" datasource="..." username="..." password="...">
  • SELECT
  • h.id
  • FROM
  • web_stats_hit h
  • </cfquery>
  •  
  •  
  • <!--- Query for the ID records using the IN directive. --->
  • <cfquery name="qInTest" datasource="..." username="..." password="...">
  • SELECT
  • h.id,
  • h.date_created
  • FROM
  • web_stats_hit h
  • WHERE
  • <!---
  • In the ValueList(), I am passing in comments in an attempt
  • to make the amount of SQL larger so that it compares to the
  • next query which physically has more character data.
  • --->
  • h.id IN ( #ValueList( qIDs.id, ", /*..*/" )# )
  • </cfquery>
  •  
  •  
  • <!--- Query for the ID records using an OR clause for every ID. --->
  • <cfquery name="qOrTest" datasource="..." username="..." password="...">
  • SELECT
  • h.id,
  • h.date_created
  • FROM
  • web_stats_hit h
  • WHERE
  • 1 = 1
  • <!---
  • Loop over IDs. I am doing a 'sloppy' loop here to create
  • as little white space as possible so as to not have to push
  • so much darn data to the SQL server.
  • --->
  • <cfloop query="qIDs">OR id=#qIDs.id# </cfloop>
  • </cfquery>

The web stats database returns about 3,500 records. In the first query, I am using the IN directive and am just passing in a comma-delimited list of IDs. I am also throwing some bunk comments into the delimiter list so that the actually, physical size of the first query is comparable to that of the second. Value lists just take up less character data.

It turns out that the IN directive is MUCH faster. For the examples above, the IN test runs on average at just under 1,000 ms where as the OR test runs in just over 6,000 ms. If I take away the comments in the ValueList() call (ex. #ValueList( qIDs.id )#), then the IN test query drops to about 300 ms - a third the parsing and execution time.

So, it seems that raw character data accounts for a little bit of the speed of the query as the removing of the comments (/*..*/) demonstrates, but it looks like IN is just much faster than multiple ORs. This of course, only tests numeric data. I would have to come up with a slightly different test to see about string list data. Still, as I am using ID lists more often, this is very very interesting.

Download Code Snippet ZIP File

Comments (0)  |  Post Comment  |  Ask Ben  |  Permalink  |  Other Searches  |  Print Page




Reader Comments

There are no comments posted for this web log entry.


Post Comment  |  Ask Ben


Home   |   Web Log   |   ColdFusion   |   Projects   |   Resume   |   Job Form   |   Search   |   Contact
Epicenter Consulting - Custom Software Solutions for Business Evolution HostMySite.com - The Leader In ColdFusion Hosting