SQL User Defined Functions Are Slower Than Inline Logic

Posted September 21, 2007 at 1:47 PM by Ben Nadel

Tags: SQL

In one of my other posts on comparative SQL performance and optimization, Jason Rushton stated that in his experience, using a SQL user defined function in a query performed up to 4 times slower than the equivalent inline logic. Gus also backed this up with his opinion that UDFs should be used only when necessary.

If you stop and think about it, user defined functions in any language are going to be slower than using the equivalent inline logic; it adds more processing instructions and overhead to the overall algorithm and more processing = more slower. To see this in action, I set up this little test:

  • <!---
  • Get all the IDs as strings using a SQL user
  • defined function, ToString().
  • --->
  • <cftimer
  • type="outline"
  • label="User Defined Function">
  •  
  • <!--- Start running total. --->
  • <cfset intTotal = 0 />
  •  
  • <!--- Loop over this method 10 times. --->
  • <cfloop index="i" from="1" to="10" step="1">
  •  
  • <!--- Query for IDs. --->
  • <cfquery name="qID" datasource="#REQUEST.DSN.Source#" result="objResult">
  • SELECT
  • dbo.ToString( b.id ) AS id_string
  • FROM
  • blog_entry b
  • </cfquery>
  •  
  • <!--- Output execution time of query. --->
  • #objResult.ExecutionTime# ms /
  • #qID.RecordCount# records
  • <br />
  •  
  • <!--- Update total. --->
  • <cfset intTotal = (intTotal + objResult.ExecutionTime) />
  •  
  • </cfloop>
  •  
  • <!--- Output average run time. --->
  • Average: #Fix( intTotal / 10 )#<br />
  •  
  • </cftimer>
  •  
  •  
  • <!--- Get all the IDs as strings using inline SQL. --->
  • <cftimer
  • type="outline"
  • label="Inline Logic">
  •  
  • <!--- Start running total. --->
  • <cfset intTotal = 0 />
  •  
  • <!--- Loop over this method 10 times. --->
  • <cfloop index="i" from="1" to="10" step="1">
  •  
  • <!--- Query for IDs. --->
  • <cfquery name="qID" datasource="#REQUEST.DSN.Source#" result="objResult">
  • SELECT
  • LTRIM( RTRIM( STR( b.id ) ) ) AS id_string
  • FROM
  • blog_entry b
  • </cfquery>
  •  
  • <!--- Output execution time of query. --->
  • #objResult.ExecutionTime# ms /
  • #qID.RecordCount# records
  • <br />
  •  
  • <!--- Update total. --->
  • <cfset intTotal = (intTotal + objResult.ExecutionTime) />
  •  
  • </cfloop>
  •  
  • <!--- Output average run time. --->
  • Average: #Fix( intTotal / 10 )#<br />
  •  
  • </cftimer>

Here, we are converting the row ID to a string, first by using the SQL user defined function, ToString() and then afterwards by putting the same logic inline. Running the above code, we get the following times:

SQL User Defined Function

178 ms
157 ms
165 ms
179 ms
139 ms
160 ms
160 ms
68 ms
154 ms
140 ms

Average time: 150 ms

Equivalent Inline SQL Logic

100 ms
142 ms
157 ms
100 ms
115 ms
156 ms
135 ms
156 ms
159 ms
100 ms

Average time: 132 ms

Note: I ran the code 10 times (listed above) giving me the average of 10 averages.

When running this on a 50,000+ record set, it looks like (as would be expected) the inline logic is performing better than the user defined function. I am not, however, seeing the 4x speed difference that Jason had mentioned. Of course, my logic is uber simple and the UDF logic is also very small.

While the difference here does not jump out and slap you on the ass, the difference in performance is there; as such, it is something to think about when you are optimizing your queries. Of course, you have to take into account the ease of coding vs. performance requirements. If putting logic inline gives you a 30 ms improvement on a report that gets run once a day, is that worth NOT getting the benefits of a UDF? Probably not. But if this is a query that either sees a huge performance hit, slows down the rest of a site, or gets run very often, every few milliseconds you can squeeze out can make a difference.



Reader Comments

Sep 21, 2007 at 2:22 PM // reply »
20 Comments

CFC's are slower than inline logic as well... ;)

<ducks>


Sep 21, 2007 at 2:34 PM // reply »
10,640 Comments

Ha ha ha :)


Sep 21, 2007 at 4:02 PM // reply »
35 Comments

Assembly is faster than CF

**Grabs Pie and throws**

DW


Ian
Sep 21, 2007 at 4:16 PM // reply »
7 Comments

My Mustang is faster than your Camero.


Sep 22, 2007 at 4:49 AM // reply »
6 Comments

User Defined Functions can also execute faster on average then inline logic if they are immutable and the database supports result caching.

However, the real problem of User Defined Functions is not the overhead in invoking them, but that from the perspective of the planner they usually behave like a black box. They do not have an adequate selectivity definition and they do not have an adequate cost definition. For a thorough explanation you should read Optimization of Queries with User-Defined Predicates http://ee.snu.ac.kr/~shim/tods.pdf


Sep 23, 2007 at 11:58 AM // reply »
10,640 Comments

@Jochem,

I have put that PDF in my "To Be Reviewed" folder. Skimming over it, it seems to have a lot of higher level math and things of that nature... gonna need to set aside some time to really pick through it. Thanks.


Sep 24, 2007 at 2:37 AM // reply »
132 Comments

I think it's important to note that a procedure invocation in a language like SQL need not be slower than the equivalent inline logic. Specifically when the logic is complex.

SQL is parsed (generally speaking), at runtime, and reparsed every time the query is executed unless the database is really smart about caching operations. Parameterizing the query usually helps significantly, and depending on the database, may provide *massive* performance boosts. For instance Oracle bogs down really fast unless you use bind params.

However, complex logic that can be wrapped up in a UDF can provide a real performance boost if the database can determine that the UDF can be compiled and stored and then later when a dynamic query is executed, that the database needs to recompile each time, that part of the query will be excluded from the recompile.

Nothing prevents a database from inlining the logic right into the real query if the database can determine the resulting (or lack of) side effects.

We should look at a lot of factors when deciding how to structure our database calls and how to handle queries, because there's a whole lot more going on than a simple function call that's going to get in the way of performance.

I'd certainly be more worried about indexes, materialized or indexed views, and most importantly the algorithms, such as set based solutions to complex tasks, instead of worrying about the constant time execution overhead that comes from UDF invocation.


Sep 24, 2007 at 7:22 AM // reply »
10,640 Comments

@Elliott,

All good points. I think I am still at a point where I don't have complex logic either inline or in UDFs. Perhaps I am just not experienced enough to see how much of my processing can be moved into SQL (as opposed to trying to leverage ColdFusion post-query).


Oct 17, 2007 at 8:18 PM // reply »
1 Comments

Elliot - when you say "Nothing prevents a database from inlining the logic right into the real query if the database can determine the resulting (or lack of) side effects", what factors go into allowing the database to determine a lack of "side effects" and what possible side effects could there be of a UDF?

I ask because we have a stored procedure that takes a @firstName parameter and does this

SELECT x, y, z
WHERE udf_CompareText(x, @name) = 1

to offload a multistep comparison process. Scaled down a bit, the udf looks like this:

CREATE FUNCTION udf_CompareText (@param varchar(50), @valueInTable varchar(50))

...

IF (
(@param = @valueInTable) OR
(charindex(@param, @valueInTable) > 0) OR
(charindex(@valueInTable, @param) > 0) OR
(soundex(@param) = soundex(@valueInTable))
) return 1

etc.

We have an index on the firstName column. Inlining this comparison is significantly faster than doing it in the UDF. Is there any way to help my database engine (SQL Server) to, as you say, inline the logic implicitly?


Feb 9, 2009 at 10:49 AM // reply »
1 Comments

I can confirm that UDFs (table functions - not scalar) are up to 4x slower than inline SQL logic. I just created an UDF with a simple select statement. It takes 60s+ to complete while the plain select statement is finished after 15s. I have no idea how this MAY happen.


Mar 9, 2009 at 4:34 AM // reply »
1 Comments

inline is prone to SQL Injection attack. Its more faster that other but more vulnerable to SQL injection attacks.


Post A Comment

Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.

Please review the following issues:

Author Name:


Author Email:

Author Website:

Comment:

Supported HTML tags for formatting: <strong>bold</strong>   <em>italic</em>   <code>code</code>







  • Help Wanted - Find Your Next ColdFusion Job
InVision App - Prototyping Made Beautiful With Prototyping Tools Ben Nadel's Company - Epicenter Consulting Recent Blog Comments
Feb 10, 2012 at 7:21 PM
jQuery AJAX Strips Script Tags And Inserts Them After Parent-Most Elements
Update! Instead of $(eval(options.insertAfter)).after(data['insertData']); I now use: var ajaxNode = document.createElement('span'); var parent = $(eval(options.insertAfter))[0].parentNode; ... read »
Feb 10, 2012 at 6:18 PM
jQuery AJAX Strips Script Tags And Inserts Them After Parent-Most Elements
encountered this same, what I consider, jQuery bug last week. I'm building a site in which I load some content via AJAX. This content contains Linkedin share button placeholders which Linkedin API ne ... read »
Feb 10, 2012 at 11:30 AM
Cross-Origin Resource Sharing (CORS) AJAX Requests Between jQuery And Node.js
After you understand the concepts here, this is an awesome cheatsheet for enabling CORS in just about anything http://enable-cors.org/ ... read »
JM
Feb 10, 2012 at 9:10 AM
My Safari Browser SQLite Database Hello World Example
@Amy, Here is a very good tutorial on how to use JOIN: http://www.sqltutorial.org/sqljoin-innerjoin.aspx ... read »
Feb 10, 2012 at 4:42 AM
Building A Twitter-Inspired RESTful API Architecture In ColdFusion
This is great, very useful Ben. I spotted a small typo in the api.cgm listing: <cfthrow type="Unauthroized" /> Cheers Stefan ... read »
Feb 9, 2012 at 10:35 PM
CFDirectory Filtering Uses Pipe Character For Multiple Filters (Thanks Steve Withington)
I was wondering if there would be a filter you could apply so that you got everything but what you included in the filter. As in show me all docs that are not a .pdf. ... read »
Feb 9, 2012 at 10:29 PM
Learning ColdFusion 9: Application-Specific Data Sources
@Ben, No offence, but if people were really wanting advanced features they would be using a platform like ASP.NET MVC. CFML is so structurally compromised as a tag-based scripting language that ... read »
Feb 9, 2012 at 10:03 PM
Subversion - Cleanup Failed To Process The Following Paths
@Leviaguirre, do you still have problems with this? ... read »