Yesterday, in response to my post about [SQL queries and code duplication], Phill Nacelli demonstrated that you could keep the "shape" of a prepared statement consistent even with a dynamic set of query criteria. I had never seen this approach before and I found it completely fascinating. The benefit of a consistently-shaped prepared statement is that MySQL only has to parse it once. Then, on subsequent calls, it uses the cached version regardless of which parameters are passed-through. This reduces the number of queries that MySQL has to keep in its cache and should provide some degree of performance improvement. As such, I thought it was worth taking a closer look at this concept.
The key to Phill's approach is short-circuit evaluation within the
WHERE clause. Rather than excluding a dynamic filter from the query on the ColdFusion server, he includes it in the prepared statement but causes it to be short-circuited during evaluation. This leads to a prepared statement that always has the same "shape" and number of parameters.
In his comments, Phill used ColdFusion's CFScript syntax for his query; however, I still write the vast majority of my "database gateway layer" using CFML Tags. As such, I'll be using Tags in my exploration.
ASIDE: I use Tags for my queries specifically for the ability to optionally include sub-statements in a query. So, in theory, this approach to consistently-shaped prepared statements removes the need for Tags and would open up the opportunity for me to switch to the
To showcase this concept, I've created a
getByFilter() method that searches a
project table using several optional filters. Each filter is included in the query; but, may be skipped using a NULL-safe operator:
<cffunction name="getByFilter" output="false" returntype="query"> <!--- Define arguments. ---> <cfargument name="id" type="numeric" required="false" /> <cfargument name="companyID" type="numeric" required="false" /> <cfargument name="name" type="string" required="false" /> <cfquery name="local.results"> /* DEBUG: myGateway.getByFilter(). */ SELECT * FROM project p WHERE 1 = 1 -- Apply "id" filter. AND ( NULL <=> <cfqueryparam null="#isNull( id )#" value="" /> OR <cfif isNull( id )> id = <cfqueryparam null="true" /> <cfelse> id = <cfqueryparam value="#id#" sqltype="cf_sql_integer" /> </cfif> ) -- Apply "companyID" filter. AND ( NULL <=> <cfqueryparam null="#isNull( companyID )#" value="" /> OR <cfif isNull( companyID )> companyID = <cfqueryparam null="true" /> <cfelse> companyID = <cfqueryparam value="#companyID#" sqltype="cf_sql_integer" /> </cfif> ) -- Apply "name" filter. AND ( NULL <=> <cfqueryparam null="#isNull( name )#" value="" /> OR <cfif isNull( name )> name LIKE <cfqueryparam null="true" /> <cfelse> name LIKE <cfqueryparam value="%#name#%" sqltype="cf_sql_varchar" /> </cfif> ) </cfquery> <cfreturn results /> </cffunction>
<=>is the NULL-safe comparison operator in MySQL. This will return
NULL <=> NULL, unlike the normal comparison operator,
=, which will return
NULL = NULL.
As you can see, I am including a
WHERE condition for each potential filter, even if there is no corresponding
cfargument to apply. This approach is a little more verbose when compared to my default strategy for dynamic filtering. But, what this does do is create a prepared statement that always looks like this:
/* DEBUG: myGateway.getByFilter(). */ SELECT * FROM project p WHERE 1 = 1 -- Apply "id" filter. AND ( NULL <=> ? OR id = ? ) -- Apply "companyID" filter. AND ( NULL <=> ? OR companyID = ? ) -- Apply "name" filter. AND ( NULL <=> ? OR name LIKE ? )
As you can see, the shape of the SQL statement is always the same. What makes the filtering dynamic is the fact that the NULL-check -
NULL<=>? - for each property will conditionally short-circuit the
OR condition, causing the given field evaluation to be skipped.
Now, no matter which filters are being applied to the query, the same prepared statement is being sent to MySQL; which has only to reach into its cache in order to understand the query execution plan that it has to run.
One concern that I had with this approach is that it my affect which indices are used when executing the query. But, if I run an
EXPLAIN on a query that only uses the
companyID, I get the following evaluation:
As you can see, even with the presence of
name conditions (short-circuited), the MySQL query planner is intelligent enough to know that it should choose the index for the only column that will actually affect the outcome,
I find this approach to be absolutely fascinating. Regardless of the decreased cache pressure and the potential for a performance improvement (albeit small), I think there's a certain elegance to keeping the shape of the query consistent. Not to mention the fact that a consistent shape may obviate the need for me to be writing my SQL queries using Tags. This approach - having nothing to do with SQL itself - opens up the opportunity for me to go full-CFScript in my ColdFusion code.
This is great, Ben. Like you I've still got a lot of query code in tags for the same reason.
I think you could simplify the conditionals by replacing:
<cfif isNull( id )> id = <cfqueryparam null="true" /> <cfelse> id = <cfqueryparam value="#id#" sqltype="cf_sql_integer" /> </cfif>
id = <cfqueryparam value="#id#" sqltype="cf_sql_integer" null="#isNull( id )#" />
I tried that at first, but you run into issues with
#id#. If the argument is
null, the attempt
to interpolate the
value throws an error :(
That said, one option could be to default the argument to a
non-active value. So, in the case of the
id, something like:
<cfargument name="id" type="numeric" required="false" default="0" />
... then, instead of checking for
isNull(), you could
check for it as a "falsey" value:
id = <cfqueryparam value="#id#" sqltype="cf_sql_integer" null="#( ! id )#" />
That works well for numeric types; but, it might get tricky
with other data types, like dates. I think you can default a
0 as well, and possibly do the same thing (since
dates kind of cast seamlessly to numbers); but, I'd have to check that
before I can confirm.
You're right, Ben, I should have tested before posting.
You can indeed use
0 as a date default but I think I'd
prefer using an Elvis to an argument default:
id = <cfqueryparam value="#id?:0#" sqltype="cf_sql_date" null="#isNull( id )#" />
Tested this time :-)
PS: In Lucee at least you don't need the
id = <cfqueryparam value="#id?:0#" sqltype="date" null="#isNull( id )#" />
OH BRO!!! What a perfect use of the Elvis operator!! To be honest, I had completely forgotten that this operator even existed because I never really know where to use it. That is brilliant.
Also, good to know that I can dump the
cf_sql_ prefix. I
didn't know that. I only recently saw in the docs that you can use
sqltype in lieu of
cfsqltype; but, didn't
realize the value was also modified.
So much value add :D