The other day, I asked my managed hosting provider to upgrade my MySQL version from 5.7.10 to 5.7.37 - the most recent general availability release of the 5.7 edition. I didn't want to go any farther than that because I didn't want to test the update - I just wanted access to the JSON functions and column-type. There was a little miscommunication, however, and the Support team ended up putting me on the latest version of MySQL, 8.0.28. This caused a few hours of accidental downtime (due to an incompatibility with the datasource connection-string). But, once I hotfixed that, the site appeared to be running smoothly. Once the dust settled, I took this an opportunity to cleanup a bunch of database schema and connectivity issues.
Cleaning Up My Datasource Connection String
After they upgraded me to MySQL 8.0.28, my site went down for two reasons:
I had two MySQL drivers (Java JAR files) installed, and I think ColdFusion was getting confused about which driver to actually use. I'm not entirely sure this is the case; however, when I removed the 5.x JAR file, leaving me with just the 8.0.22 JAR file, the connection error changed.
Once I removed the old MySQL driver, the error message changed to include an issue with the timezone, or lack thereof:
Connection verification failed for data source: bennadel
java.sql.SQLException: The server time zone value 'Coordinated Universal Time' is unrecognized or represents more than one time zone. You must configure either the server or JDBC driver (via the 'serverTimezone' configuration property) to use a more specific time zone value if you want to utilize time zone support.
To hotfix this issue, I went into the datasource and added the following to my connection string:
And poof, the site started working again!
In my experience, the database connection string is one of those parts of the application that tends to rot over time. As different versions of the database engine come-and-go, aspects of the connection string become irrelevant. And, since few of us "application developers" are also "database experts", it's challenging to look at the connection string and have a sense of what shouldn't be there; and, perhaps more importantly, what is missing.
To start cleaning up my MySQL connection string, I went to the MySQL Connector/J 8.0 Developer Guide - Changes in Connection Properties and looked to see if anything I currently had was either removed or deprecated. I found three relevant properties that I currently had in my connection string that should not be there:
useUnicode=true(removed) - while not technically in this document, I found that
useUnicodeis no longer even listed as a property in the Connector/J connection options.
useLegacyDatetimeCode=false(removed) - removed in 8.0 version.
useDynamicCharsetInfo=false(removed) - removed in 8.0 version.
After looking at the Connector/J changes, I then went to the Connector/J 8.0 configuration properties and looked through the list to see what was available. I ended up including the following properties:
allowMultiQueries=true- I've long used this feature in my ColdFusion applications to allow me to execute more than one statement inside a single
CFQuerytag. While some consider this to be a "Security issue", I find it to be extremely helpful.
characterEncoding=UTF-8- In order to get the ColdFusion datasource connection to work with emoji and other Astral Plane Unicode characters, I had to tell it to specifically use a UTF-8 charset.
serverTimezone=UTC- I don't quite understand why I needed this since the ColdFusion server is running in UTC already. But, apparently this helps the driver figure out how to apply date/time conversions on the data coming out of the database.
tinyInt1isBit=false- I always use
BITin my ColdFusion applications. And, I don't want the MySQL driver trying to magically turn anything into a
BITon my behalf. I am expecting my "Boolean values" to be coming back as
useConfigs=maxPerformance- Because I like the sound of max performance!
For this last one -
useConfigs - I had a hard time finding any documentation on it at all. I finally went into the GitHub repository for the Connector/J Driver and found the
.properties file that gets loaded for
# # A configuration that maximizes performance, while # still staying JDBC-compliant and not doing anything that # would be "dangerous" to run-of-the-mill J2EE applications # # Note that because we're caching things like callable statements # and the server configuration, this bundle isn't appropriate # for use with servers that get config'd dynamically without # restarting the application using this configuration bundle. cachePrepStmts=true cacheCallableStmts=true cacheServerConfiguration=true # # Reduces amount of calls to database to set # session state. "Safe" as long as application uses # Connection methods to set current database, autocommit # and transaction isolation # useLocalSessionState=true elideSetAutoCommits=true alwaysSendSetIsolation=false # Can cause high-GC pressure if timeouts are used on every # query enableQueryTimeouts=false # Bypass connection attribute handling during connection # setup connectionAttributes=none
In the end, here's the value that I have in the Connection String property of my ColdFuision datasource:
I will strongly caveat though that, as stated above, I am not a database expert. As such, take all of this with a grain of salt - this is just what I setup, this is not a recommendation for your setup.
Cleaning Up My Database Schema
A couple of years ago, I updated a few table columns to be
utf8mb4 in order to allow people to post emoji in my blog comments. This was the least amount of work that I could in order to enable that feature. But, with this upgrade to MySQL 8, I wanted to take a moment to actually modernize all of my tables and character sets.
I really only have a vague understanding of how all the character set encodings work, so I turned to Mathias Bynens' article, How to support full Unicode in MySQL databases. Based on what he has there - and some stuff that I read on StackOverlow about collation - I wrote a bunch of
ALTER statements to change:
- The default character set on the database.
- The default character set on the tables.
- Update the character set of any text-based field.
Basically, I changed all the character sets to
utf8mb4 and all the collations to
NOTE: You may notice that I have a mixture of headless-camel-case and snake-case in my table column names. This is just a change in my preferred style over time. I intend to go back and update all the columns to be headless-camel-case eventually.
/* Change the default charset for the database. */ ALTER DATABASE bennadel DEFAULT CHARACTER SET utf8mb4 DEFAULT COLLATE utf8mb4_0900_ai_ci ; /* Change the default charset for all the tables. */ ALTER TABLE blog_comment CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE blog_comment_edit_token CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE blog_entry CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE blog_entry_blog_entry_jn CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE blog_entry_subscription CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE blog_entry_tag_jn CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE member CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE member_approval CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE pending_blog_comment CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; ALTER TABLE tag CHARACTER SET utf8mb4, COLLATE utf8mb4_0900_ai_ci ; /* Update the text columns to have the new charset. */ ALTER TABLE blog_comment MODIFY COLUMN content longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN content_markdown longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL ; ALTER TABLE blog_comment_edit_token MODIFY COLUMN value varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL ; ALTER TABLE blog_entry MODIFY COLUMN name varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN description longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN content longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN content_markdown longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN meta_keywords varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN meta_description text CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL ; ALTER TABLE member MODIFY COLUMN name varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN email varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN url varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL ; ALTER TABLE pending_blog_comment MODIFY COLUMN content longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN contentMarkdown longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN authorName varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN authorEmail varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL, MODIFY COLUMN authorUrl varchar(500) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL ; ALTER TABLE tag MODIFY COLUMN name varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL ; /* Optimize tables with new changes. */ OPTIMIZE TABLE blog_comment; OPTIMIZE TABLE blog_comment_edit_token; OPTIMIZE TABLE blog_entry; OPTIMIZE TABLE member; OPTIMIZE TABLE pending_blog_comment; OPTIMIZE TABLE tag;
As part of the column
ALTER statements, I took the opportunity to remove a bunch of
DEFAULT values. At this point in my career, I kind of prefer not having any defaults so that the code throws an error if I'm missing something in my
INSERT statements. And, if I find that I have code doesn't "do anything", then I'd rather remove that column from the database rather than having a bunch of meaningless defaults.
I also increased a number of
varchar fields to be
255 in length. I'm currently mid-thought on how I feel about
varchar field length - more to come on that in a future post. I was also a little afraid that I would get some sort of accidental value truncation with the new character set - though, that fear may have been unfounded?
Cleaning Up My SQL Statements
Once I had the proper
CHARACTER SET on my database table schemas; and, once I added
characterEncoding=UTF-8 to my datasource connection string; I was finally able to get rid of the
SET NAMES directives in my SQL. I went to the various comment
INSERT statements and removed the following line:
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;
So Far, Nothing Has Exploded
Because databases - and the character sets therein - are still kind of magical in my eyes, I don't fully know what to expect from these changes. But, so far, nothing has exploded. I'm not seeing any SQL errors; I'm not seeing any ColdFusion errors; and, emoji characters continue to work in comments. Hopefully I haven't missed anything important!
Since I wasn't expecting to be on MySQL 8, I don't even know what new features are now available to me. So, I'll definitely be digging into that changeset net!
Want to use code from this post? Check out the license.
Eventually, I want to stop using the ColdFusion Administrator to setup the datasources and start using
this.datasources in the
Application.cfc. But, those URLs are a bit more complicated - and I only wanted to attack one thing at a time.
Ok, it took me a few mornings of digging through configuration files; but, I finally figured out how to move my datasource configuration to a per-application setting in ColdFusion 2021:
I was quite surprised at how hard it was to find documentation on this feature. Hopefully I just missed it somewhere and there is some good documentation for it. If not, maybe my post will help others figure it out.
One feature that I'm excited to try is something called a
LATERAL derived table:
Unlike a standard derived table which has to be "constant", a
LATERAL derived table can refer to the outer query row. Which means, getting the
LATERAL records is a context-sensitive operation. Very interesting stuff!
At InVision, we use Mode for running queries on our data warehouse. Lots of those queries, created by "data scientists" use some crazy SQL in them. One aspect of SQL that they use is something called a Common Table Expression (CTE). As of MySQL 8, we can now use CTEs as well. And, after I read up on them, they are actually quite simple but powerful:
I think this is going to go a long way to make complex queries easier to read! And, when they are easier to read, they are easier to maintain 💪
Any reason why you haven't moved to mariadb or percona yet?
The performance gains are enough (massive), and extra functionality is nice lso in the language side.
At this point, moving to a new database would just be a "point of friction" since I'm already on MySQL and it seems to work just fine. I am not sure there would be much improvement in terms of performance, at least not for a blog. When I look in my FusionReactor APM, the database interaction is probably the fastest part of the whole application. Most of my queries execute in a few milliseconds.
I'd be curious to know more about the performance gains you see from switch to the other databases? I have a lot of respect for the Percona team - their DB management tools are amazing and have been a life-saver for my team at work. But, re: performance, are there particular cases where you've seen noticeable improvements?