For the past few weeks, I've been trying to clean-up and normalize my blog content using Markdown. As you can imagine, this is not an easy process; but, I thought it was going quite smoothly. That is, until last night when I noticed that one of my posts was being truncated at 64,000 characters:
64,000 characters is the default "Long Text Buffer" size in Adobe ColdFusion's database drivers (at least the MySQL one). You can insert more than 64K characters under such settings; however, if you try to read that data out of the database, it gets truncated. And, since my data clean-up script was querying for data, cleaning it up, and then re-saving it, it becomes clear that the clean-up was taking place on arbitrarily truncated data (which was then be re-saved over the original data - poo!).
Luckily, I only had a few posts that were longer than 64,000 characters; and, I was able to locate some original copies. So, disaster averted. But, it got me thinking about what Lucee CFML is doing by default.
To test this, I used CommandBox to spin up a fresh Lucee CFML 188.8.131.52 server and a fresh Adobe ColdFusion 2018 (2018.0.09.318650) server. Then, I went into each administrative panel and I created a MySQL datasource with the default settings.
NOTE: I added
allowMultiQueries=trueto the connection string in order to run the compound SQL statements in my test.
Once I had the data-source in place, I create this ColdFusion script (compatible with both Adobe ColdFusion and Lucee CFML). The goal of the test is to insert two values: one smaller than 64,000 characters and one larger than 64,000 characters. And then, to see how data goes into and comes out of the database:
<!--- Let's generate two values - one short one and one LONG ONE that is designed to use up more than 64,000 characters (which is the default truncation length in Adobe ColdFusion's data-source configuration). ---> <cfset shortValue = " Hello world ".repeatString( 10 ) /> <cfset longValue = " Hello world ".repeatString( 70000 ) /> <!--- Let's try to insert some data! ---> <cfquery name="dbInsert" datasource="length_test"> TRUNCATE TABLE length_test ; INSERT INTO length_test SET id = 1, value = <cfqueryparam value="#shortValue#" cfsqltype="cf_sql_longvarchar" /> ; INSERT INTO length_test SET id = 2, value = <cfqueryparam value="#longValue#" cfsqltype="cf_sql_longvarchar" /> ; -- Now, let's see how much data was actually inserted into the database. SELECT id, value, LENGTH( value ) AS valueLength FROM length_test ; </cfquery> <cfoutput> <ul> <li> <strong>Short Value ( #dbInsert.id[ 1 ]# )</strong> <ul> <li>Inserted Length: #numberFormat( dbInsert.valueLength[ 1 ] )#</li> <li>Returned Length: #numberFormat( dbInsert.value[ 1 ].len() )#</li> </ul> </li> <li> <strong>Short Value ( #dbInsert.id[ 2 ]# )</strong> <ul> <li>Inserted Length: #numberFormat( dbInsert.valueLength[ 2 ] )#</li> <li>Returned Length: #numberFormat( dbInsert.value[ 2 ].len() )#</li> </ul> </li> </ul> </cfoutput>
As you can see, the final query attempts to read the raw
value as well as the
LENGTH(value) of the raw value.
Now, if we run this CFML code in Adobe ColdFusion 2018, we get the following output:
As you can see, all 910,000 characters were inserted into the database using the default MySQL driver behavior. However, when we go to read the data out of the database, the MySQL driver is truncating the value at 64,000 characters.
ASIDE: To be clear, I have never understood the point of this settings. It seems to be protecting you from ... yourself? I mean, if your application is the one inserting data into the database, and that doesn't get truncated, what's the point of trying to "protect" you on the way out? Does anyone have any insight here?
Ok, now, if we run the same CFML code in Lucee CFML 184.108.40.206, we get the following output:
Woot woot! The default behavior for the MySQL driver in Lucee CFML appears to "just work" without any arbitrary data truncation! All 910,000 characters that get inserted into the database can also be read out of the database.
On Facebook, Zac Spitzer pointed out to me that the MySQL driver in Lucee CFML has settings for large payload retrieval:
Enable binary large object retrieval (BLOB)
Enable long text retrieval (CLOB)
Though, to be clear, in my demo, these values are turned off by default:
As such, maybe it's possible that the default lengths are just much higher in Lucee CFML? I don't know. I tried looking through Lucee's GitHub repository, but I can't quite figure out where these values come into play. Some possibly related lines of code:
To see if there is some sort of default, I tried to run the Lucee CFML demo with a few more value lengths:
- 9,100,434 - ran into a
max_allowed_packet(4,194,304) MySQL error.
- 4,193,000 - worked just fine.
So, at least for my Lucee CFML server started by CommandBox with the default settings for BLOB and CLOB disabled, I don't seem to have any truncation at all at the MySQL driver level. Who knows - maybe this is actually a Lucee bug :P
Anyway, I am writing this to help me remember that this feature even exists in Adobe ColdFusion's MySQL driver; and, to test (and see) that no such CLOB retrieval limitation appears to exist in the MySQL driver for Lucee CFML 220.127.116.11.