As much as I love databases and writing SQL, the reality is, I'm not a "database admin". And, I don't have the best grasp of how databases work their magic under the hood. This leaves me open to many misunderstandings; one of which appears to be the way that the
CFQueryParam tags work with the MySQL JDBC Driver (MySQL Connector/J 5.1) when used in ColdFusion / CFML. I believe that my understanding of the caching and performance benefits have been way off-base.
To set the stage for this post, I've been thinking a lot about my use of the
CFQueryParam tag within
IN () clauses; and, specifically about how the
CFQueryParam tag works with lists of values; and, how those dynamic queries may be filling up the prepared statement cache in the database (since a new prepared statement would theoretically be created for each unique-length of
IN () value).
With that said, earlier this week, I went to see how many prepared statements were being cached in my local MySQL 5.6 database:
SHOW STATUS LIKE '%prepared%';
... which gave me the following:
mysql> SHOW STATUS LIKE '%prepared%'; +---------------------+-------+ | Variable_name | Value | +---------------------+-------+ | Prepared_stmt_count | 0 | +---------------------+-------+ 1 row in set (0.00 sec)
Ummmmm, wat?! Where the heck are all my prepared statements? 99% of my
CFQuery tags include a
CFQueryParam tag - I was expecting this number to be in the thousands.
When I saw this, I started to go down the rabbit-hole of Google searches, trying to understand why I wasn't seeing any prepared statements showing up in the database. I turned on the MySQL general log to see if I could spot anything fishy:
-- Store the global log output in a table so we can query it. SET GLOBAL log_output = 'TABLE'; -- Enable the global log. SET GLOBAL general_log = 'ON'; -- Clear the log to start fresh. TRUNCATE TABLE mysql.general_log;
Then, I started clicking through the application in order to populate the now-active general log. After a minute or two of this, I went back to the database to see what was showing up:
SELECT l.event_time, l.command_type, l.argument FROM mysql.general_log l WHERE l.argument NOT LIKE '%general_log%' -- Don't return "this" query. ORDER BY l.event_time DESC ;
Which gave me this:
Notice that all of the command_type values are coming back as query. According to the MySQL 5.6 documentation on Prepared Statements, these should be showing up as
Execute commands, not
Querycommands. Well, that is, if they were being run as prepared statements.
At this point, I started to look at how our MySQL JDBC driver was being configured. And, in the comments of the code, I saw this:
// 'max performance' options: // https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html
PUBLIC SERVICE ANNOUNCEMENT: If you put comments in your code, it can actually help the next person who reads it. Take this to heart, all ye who think that comments don't belong in code.
According to the linked resource, here's the configuration for "max performance":
As you can see, it has
cachePrepStmts, which is documented as:
cachePrepStmts- Should the driver cache the parsing stage of PreparedStatements of client-side prepared statements ....
This "client-side prepared statements" caught me off-guard. Are there both client-side and server-side prepared statements? I had always understood prepared statements as being a server-side (database) construct. I had thought the workflow went something like this:
Database driver sends a representation of the query to the database to "prepare" it for parameterization.
Database caches the parsed query.
Database driver sends the "query token" and the parameters to the database.
Database executes the prepared statement with the given parameters.
But, I'm seeing that it's not that simple.
As I was reading through the J-Connector documentation page, I came across this setting:
useServerPrepStmts- Use server-side prepared statements if the server supports them. Default: false.
Notice that the "max performance" configuration omits the
useServerPrepStmts setting, which means that it is defaulting to false, which means that my application isn't using any server-side prepared statements.
To test this new understanding, I created a stand-alone ColdFusion application with the following JDBC connection settings:
useServerPrepStmts=true<== Added this setting!
Then, I created a small
<cfquery name="test"> /* DEBUG: MySQL prepared statement test. */ SELECT * FROM user WHERE id = <cfqueryparam value="1" sqltype="integer" /> ; </cfquery> Get ye to the general_log!
After running this page a few times, I went back to the general log:
SELECT l.event_time, l.command_type, l.argument FROM mysql.general_log l WHERE l.argument NOT LIKE '%general_log%' -- Don't return "this" query. AND l.argument LIKE '%prepared statement test%' ORDER BY l.event_time DESC ;
Which gave me the following:
Even with the
useServerPrepStmts connection setting enabled, this test query is still showing up as a
Eventually, Google brought me back to the MySQL Prepared Statement docs, which contained the phrase:
SQL syntax for prepared statements does not support multi-statements (that is, multiple statements within a single string separated by ; characters).
This makes sense. And, my test query was only running a single statement. However, I am often in the habit of ending my SQL statements with a semi-colon (
;) in the same way that I often end my sentences with a period (
.) in order to denote the "end" of the statement. To see if this "meaningless semi-colon" was causing an issue, I re-ran the test without the semi-colon:
<cfquery name="test"> /* DEBUG: MySQL prepared statement test. */ SELECT * FROM user WHERE id = <cfqueryparam value="1" sqltype="integer" /> -- ; -- Semi-colon commented-out! </cfquery> Get ye to the general_log!
After running this a few times, I went back to the general log and finally saw what I was expecting to see:
Finally! The query is now showing up as a prepared statement in the server. As you can see, there is a
Prepare command_type followed by a series of
Awesome; but, also, wat?! Are you telling me that by including a "meaningless" semi-colon at the end of my SQL queries, I may be accidentally and fundamentally changing the way the SQL query is managed and executed?! That seems completely bananas!
Clearly, my dangling semi-colon (he he he) affects server-side caching of prepared statements; but, I have no idea if the same holds true for client-side caching of prepared statements. But, I supposed that in order to be safe, I should stop ending SQL statements with a semi-colon if I'm not actually running multi-statements in a single request.
These JDBC/J-Connector connection settings now represent what I had always assumed was going on behind the scenes. But, clearly this is not the setting that I've been using for years. And, as we saw above, this is not the setting for "max performance" according to the MySQL documentation.
Since I've always associated the
CFQueryParam and prepared statements with "max performance", I wanted to better understand why using server-side prepared statements isn't the default. To this end, I came across a great article, How does the MySQL JDBC driver handle prepared statements by Vlad Mihalcea. In his article, Vlad mentions:
If the data access logic doesn't cache prepared statements, the extra database round-trip can actually hurt performance. For this purpose, some database systems don't default to server-side prepared statements and execute a client-side statement preparation instead.
Very interesting! This could explain why the J-Connector docs don't include it as a default. Vlad goes on to describe how client-side prepared statements work (which is what I've apparently been using all these years):
When the statement is prepared on the client-side, the bind parameter tokens are replaced with actual parameter values prior to sending the statement to the database server. This way, the driver can use a single request to fetch the result set.
Very interesting! So, even with the
CFQueryParam tags in place, the MySQL database is still receiving the "raw query" - the parameterization and substitution is only happening within the JDBC driver.
Continuing to try and understand how this all fits together, I read through the J-Connector docs and found this setting:
prepStmtCacheSize- If prepared statement caching is enabled, how many prepared statements should be cached? Default: 25
Oh chickens! Are you telling me that of the thousands of unique SQL queries that my application is running on a regular basis, any given connection in the database connection pool is only preparing and caching 25 of them? And the rest are what? Being re-parsed on every single execution?
There's a lot of food-for-thought here! This journey down the prepared-statement rabbit-hole has really shown me how little I've understood about what the database connection pool is doing, how prepared statements work, and who's actually doing the caching in these workflows. Here's what I'm taking away from this:
I still love the
CFQueryParamtag in terms of security. It's ability to prevent SQL-injection attacks help makes ColdFusion one of the more secure web-development languages.
Putting a dangling semi-colon at the end of my SQL statements may prevent caching of the prepared statement. I should probably stop including them unless I'm actually sending multiple statements.
CFQuerytags that contain a
CFQueryParamtag are being compiled as "prepared statements". In fact, only a small fraction of those query statements are being cached (by default) by a given connection to the database.
CFQueryParamtag in an
IN ()clause is probably not having a detrimental affect on the prepared statement cache (in the way that I feared it would) since there's only a small chance that a given query's prepared statement is being cached anyway.
Since the parameterization and value-substitution is happening on the client-side within the driver (using the default, "max performance" settings), there's no need to be concerned with the number of parameter bindings that can be used in a single-query in the MySQL database - because we're not actually using bindings in the query that gets sent to the database server.
Of course, you should take all of this with grain-of-salt because, clearly, I've been very wrong about this stuff in the past - I'm not confident that I'm any less wrong at this point. But, at least I have a slightly better understanding of how prepared statements are parsed, cached, and executed in the MySQL JDBC driver in ColdFusion and Lucee CFML.
And, if you see me getting anything blatantly wrong here, please educate me! I want to understand this stuff better.
Epilogue on the MySQL General Log
As part of this exploration, I turned on the general log, which logs every single query that gets executed. This can be no-so-great for performance. So, once I was done, I went back and turned off the general log:
-- Disable the global log. SET GLOBAL general_log = 'OFF'; -- Clear the log table that we're done. TRUNCATE TABLE mysql.general_log;
Want to use code from this post? Check out the license.
Great post, Ben.
I'm torn between "of course!" and NFW! Blimey, funny what a semi-colon can do, isn't it!
That's interesting about the configuration of the MySQL JDBC driver - where abouts would I start to configure the driver? Where do I put those options?
Right?! It's so strange. Especially when the very existence of a
; is actually preventing caching -- even if there is no content after the semi-colon. As far as where you can set these values, in our app we are setting them in the
this.datasources. But, I assume this can also be in the data-source settings within the ColdFusion administrator, probably in the Connection String box.
Excellent - thanks for the tip. Yeah, the semi-colon thing is definitely real - we got a 2s query down to 200ms, so that's a win, just by removing the semi-colon!
Yoooooo, that's awesome!
Damn. I went through my code & was hoping to find extraneous semicolons. No such luck! Just poorly written queries against a bad database design.