Yesterday, in response to my post about SQL queries and code duplication, Phill Nacelli demonstrated that you could keep the "shape" of a prepared statement consistent even with a dynamic set of query criteria. I had never seen this approach before and I found it completely fascinating. The benefit of a consistently-shaped prepared statement is that MySQL only has to parse it once. Then, on subsequent calls, it uses the cached version regardless of which parameters are passed-through. This reduces the number of queries that MySQL has to keep in its cache and should provide some degree of performance improvement. As such, I thought it was worth taking a closer look at this concept.
The key to Phill's approach is short-circuit evaluation within the
WHERE clause. Rather than excluding a dynamic filter from the query on the ColdFusion server, he includes it in the prepared statement but causes it to be short-circuited during evaluation. This leads to a prepared statement that always has the same "shape" and number of parameters.
In his comments, Phill used ColdFusion's CFScript syntax for his query; however, I still write the vast majority of my "database gateway layer" using CFML Tags. As such, I'll be using Tags in my exploration.
ASIDE: I use Tags for my queries specifically for the ability to optionally include sub-statements in a query. So, in theory, this approach to consistently-shaped prepared statements removes the need for Tags and would open up the opportunity for me to switch to the
To showcase this concept, I've created a
getByFilter() method that searches a
project table using several optional filters. Each filter is included in the query; but, may be skipped using a NULL-safe operator:
<cffunction name="getByFilter" output="false" returntype="query"> <!--- Define arguments. ---> <cfargument name="id" type="numeric" required="false" /> <cfargument name="companyID" type="numeric" required="false" /> <cfargument name="name" type="string" required="false" /> <cfquery name="local.results"> /* DEBUG: myGateway.getByFilter(). */ SELECT * FROM project p WHERE 1 = 1 -- Apply "id" filter. AND ( NULL <=> <cfqueryparam null="#isNull( id )#" value="" /> OR <cfif isNull( id )> id = <cfqueryparam null="true" /> <cfelse> id = <cfqueryparam value="#id#" sqltype="cf_sql_integer" /> </cfif> ) -- Apply "companyID" filter. AND ( NULL <=> <cfqueryparam null="#isNull( companyID )#" value="" /> OR <cfif isNull( companyID )> companyID = <cfqueryparam null="true" /> <cfelse> companyID = <cfqueryparam value="#companyID#" sqltype="cf_sql_integer" /> </cfif> ) -- Apply "name" filter. AND ( NULL <=> <cfqueryparam null="#isNull( name )#" value="" /> OR <cfif isNull( name )> name LIKE <cfqueryparam null="true" /> <cfelse> name LIKE <cfqueryparam value="%#name#%" sqltype="cf_sql_varchar" /> </cfif> ) </cfquery> <cfreturn results /> </cffunction>
<=>is the NULL-safe comparison operator in MySQL. This will return
NULL <=> NULL, unlike the normal comparison operator,
=, which will return
NULL = NULL.
As you can see, I am including a
WHERE condition for each potential filter, even if there is no corresponding
cfargument to apply. This approach is a little more verbose when compared to my default strategy for dynamic filtering. But, what this does do is create a prepared statement that always looks like this:
/* DEBUG: myGateway.getByFilter(). */ SELECT * FROM project p WHERE 1 = 1 -- Apply "id" filter. AND ( NULL <=> ? OR id = ? ) -- Apply "companyID" filter. AND ( NULL <=> ? OR companyID = ? ) -- Apply "name" filter. AND ( NULL <=> ? OR name LIKE ? )
As you can see, the shape of the SQL statement is always the same. What makes the filtering dynamic is the fact that the NULL-check -
NULL<=>? - for each property will conditionally short-circuit the
OR condition, causing the given field evaluation to be skipped.
Now, no matter which filters are being applied to the query, the same prepared statement is being sent to MySQL; which has only to reach into its cache in order to understand the query execution plan that it has to run.
One concern that I had with this approach is that it my affect which indices are used when executing the query. But, if I run an
EXPLAIN on a query that only uses the
companyID, I get the following evaluation:
As you can see, even with the presence of
name conditions (short-circuited), the MySQL query planner is intelligent enough to know that it should choose the index for the only column that will actually affect the outcome,
I find this approach to be absolutely fascinating. Regardless of the decreased cache pressure and the potential for a performance improvement (albeit small), I think there's a certain elegance to keeping the shape of the query consistent. Not to mention the fact that a consistent shape may obviate the need for me to be writing my SQL queries using Tags. This approach - having nothing to do with SQL itself - opens up the opportunity for me to go full-CFScript in my ColdFusion code.
Want to use code from this post? Check out the license.