SQL IN Directive Much Faster Than Multiple OR Clauses

Posted July 27, 2006 at 4:21 PM by Ben Nadel

Tags: ColdFusion, SQL

I use the IN directive a lot in SQL, especially in conjunction with ValueList()'s of IDs. I was working on a query today that was dealing with thousands of IDs and I got curious as to how the IN directive compared to a more labor-intensive grouping or OR clauses. To me, the IN directive is elegant, but if it's slow, I wouldn't mind writing out the ORs.

To test this, I queried the page-hit IDs from a web statistics database and then used that list of IDs to basically re-query the database for the same records:

  • <!--- Query for the hit IDs. --->
  • <cfquery name="qIDs" datasource="..." username="..." password="...">
  • SELECT
  • h.id
  • FROM
  • web_stats_hit h
  • </cfquery>
  •  
  •  
  • <!--- Query for the ID records using the IN directive. --->
  • <cfquery name="qInTest" datasource="..." username="..." password="...">
  • SELECT
  • h.id,
  • h.date_created
  • FROM
  • web_stats_hit h
  • WHERE
  • <!---
  • In the ValueList(), I am passing in comments in an attempt
  • to make the amount of SQL larger so that it compares to the
  • next query which physically has more character data.
  • --->
  • h.id IN ( #ValueList( qIDs.id, ", /*..*/" )# )
  • </cfquery>
  •  
  •  
  • <!--- Query for the ID records using an OR clause for every ID. --->
  • <cfquery name="qOrTest" datasource="..." username="..." password="...">
  • SELECT
  • h.id,
  • h.date_created
  • FROM
  • web_stats_hit h
  • WHERE
  • 1 = 1
  • <!---
  • Loop over IDs. I am doing a 'sloppy' loop here to create
  • as little white space as possible so as to not have to push
  • so much darn data to the SQL server.
  • --->
  • <cfloop query="qIDs">OR id=#qIDs.id# </cfloop>
  • </cfquery>

The web stats database returns about 3,500 records. In the first query, I am using the IN directive and am just passing in a comma-delimited list of IDs. I am also throwing some bunk comments into the delimiter list so that the actually, physical size of the first query is comparable to that of the second. Value lists just take up less character data.

It turns out that the IN directive is MUCH faster. For the examples above, the IN test runs on average at just under 1,000 ms where as the OR test runs in just over 6,000 ms. If I take away the comments in the ValueList() call (ex. #ValueList( qIDs.id )#), then the IN test query drops to about 300 ms - a third the parsing and execution time.

So, it seems that raw character data accounts for a little bit of the speed of the query as the removing of the comments (/*..*/) demonstrates, but it looks like IN is just much faster than multiple ORs. This of course, only tests numeric data. I would have to come up with a slightly different test to see about string list data. Still, as I am using ID lists more often, this is very very interesting.



Reader Comments

Jul 27, 2008 at 4:32 PM // reply »
1 Comments

Ben,

The "1=1 OR" in your where cause will return ALL records and probably explains the large difference in the run times.

I'm sure the IN is faster than multiple OR's anyway, but probably not to the degree that your test showed.


Jul 28, 2008 at 8:38 AM // reply »
11,238 Comments

@John,

Good catch! That was supposed to be "1 = 0". I probably just copied it over incorrectly as the tests came back with the same results.


Jul 28, 2008 at 8:38 AM // reply »
11,238 Comments

Oh, but I guess they would have anyway since there was no filtering on the original query.

Hmmm, nice catch.


Dec 9, 2010 at 3:30 AM // reply »
1 Comments

I've been wondering it myself and want to find out about it.
But in my test, the Or clause is faster than the In clause.

The number of records is millions (all data in the table not the result set. the result set should be hundreds to thousands)

But there's only 2 condition. Not many.

I suppose that if the case is few, Or will be better, while if the case is many, In will be better.

But, i'm not in condition to prove it though.
I haven't test it with many cause myself.

PS: i'm not sure there's will be response, since it's already very old.


Mar 23, 2012 at 2:22 PM // reply »
1 Comments

In most RDBMS, IN statements are rewritten at parse time as a list of OR predicates. So there should be no difference in performance.


Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
May 20, 2013 at 4:38 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Dana, Your confusion is well founded, since this is a very confusing features. In fact, it ONLY works if you use array notation. Meaning, that this: arrayToList( query[ "columnName" ] ) ... read »
May 20, 2013 at 4:34 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
I was thinking chicken and the egg, I wouldn't have expected it to work in the valuelist going in I guess. Maybe I just need a beer, long day :) ... read »
May 20, 2013 at 4:29 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Dana, That's if you're trying to reference a specific row. In this case, we're trying to reference the entire query column as one cohesive value. So, you are correct that if you wanted to output a ... read »
May 20, 2013 at 4:24 PM
Using A Dynamic Column Name With ValueList() In ColdFusion
I thought when you used array notation to reference queries you always had to have the row or it would throw a similar error as well? ... read »
May 20, 2013 at 11:45 AM
Using jQuery's Animate() Step Callback Function To Create Custom Animations
This is really useful. I found out that you don't actually have to use a dummy css property (surprisingly). To animate a property in a linear-gradient for instance I did this this.css('someLinearGra ... read »
May 20, 2013 at 10:51 AM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Josh, Oh snap! You're totally right! I'm not sure I've ever tried that. I did know that you can call a number of other array-methods on ColdFusion query columns: http://www.bennadel.com/blog/167 ... read »
May 20, 2013 at 10:45 AM
Using A Dynamic Column Name With ValueList() In ColdFusion
@Ben - I believe you can achieve the same functionality with ColdFusion's built in ArrayToList() function. ArrayToList( users[ "id" ] ); ... read »
May 20, 2013 at 10:21 AM
My Experience With AngularJS - The Super-heroic JavaScript MVW Framework
Is there any error logging and handling framework in angularjs, if not then in what way I can do this. ... read »
InVision App - Prototyping Made Beautiful With Prototyping Tools