Generally speaking, SQL is pretty lenient when it comes to simple types and value comparisons. Meaning, from a SQL execution standpoint,
"1" are equal because the SQL engine will coerce the values as needed (much like ColdFusion). However, document databases like MongoDB are much less lenient and will not cast values on-the-fly. This got me thinking about the new
JSON support in MySQL 5.7: will simple values get coerced when comparing an input to a
CAUTION: While MySQL will happily coerce values on-the-fly during a comparison, note that this does have implications. The most critical of which is that type coercion will bypass index selection during query planning which may result in a full-table scan.
When it comes to the
JSON data structures in MySQL 5.7, you can "extract" a value using
->); and then, you can "unquote" an extracted value using
->>). The latter of these two methods will remove the quotes (
"") from an extracted String, which may or may not change the actual data-type of the resolved value. Before digging into value comparisons, it wasn't clear to me how all of this would come together.
To test, we can create an in-memory
JSON value using
JSON_OBJECT(). Then, we can just run a number of comparisons against String and Number values:
SET @json = JSON_OBJECT( 'string', '1', 'number', 1 ); SELECT -- Control comparisons across types. ( 1 = 1 AND 1 = '1' AND '1' = '1' ), -- Test JSON String against various inputs. JSON_EXTRACT( @json, '$.string' ) = '1', JSON_EXTRACT( @json, '$.string' ) = 1, JSON_UNQUOTE( JSON_EXTRACT( @json, '$.string' ) ) = '1', JSON_UNQUOTE( JSON_EXTRACT( @json, '$.string' ) ) = 1, -- Test JSON Number against various inputs. JSON_EXTRACT( @json, '$.number' ) = '1', JSON_EXTRACT( @json, '$.number' ) = 1, JSON_UNQUOTE( JSON_EXTRACT( @json, '$.number' ) ) = '1', JSON_UNQUOTE( JSON_EXTRACT( @json, '$.number' ) ) = 1 ;
As you can see, we have a
JSON payload that contains two keys: one that points to a "numeric"
1 and one that points to a "string"
"1". Then, we proceed to compare those two values to both string and numeric inputs. And, when we run this SQL query in MySQL 5.7.32, we get the following output:
As you can see, everything came back as
TRUE except these two comparisons:
JSON_EXTRACT( @json, '$.string' ) = 1
JSON_EXTRACT( @json, '$.number' ) = '1'
When comparing an input to a
JSON value, MySQL does not coerce the operands. Meaning, in the context of a
"1" are not the equal. However, what we can see is that when we "unwrap" the
JSON value using
JSON_UNQUOTE(), MySQL will fall-back to coercing strings and numbers on-the-fly.
JSON Columns: Garbage In, Garbage Out
One of the splendiferous features of a relational database table is that is enforces a schema at the storage level. Which means, if you have an
INT column, your input is going to be stored as a number regardless of how you provide said value (
Which means, garbage in is garbage out. In other words, it's up to you and your application layer to make sure all inputs to a
JSON structure are consistent. Any mess-up and it means you're not going to get the expected query results. To see what I mean, let's look at this SQL which accidentally commingles a String value amongst a bunch of Numbers:
SELECT COUNT( * ) FROM ( SELECT JSON_OBJECT( 'value', 1 ) AS payload UNION ALL SELECT JSON_OBJECT( 'value', 1 ) AS payload UNION ALL SELECT JSON_OBJECT( 'value', '1' ) AS payload UNION ALL -- OOPS!!!! SELECT JSON_OBJECT( 'value', 1 ) AS payload ) AS derived WHERE derived.payload->'$.value' = 1 ;
As you can see, we have 4 rows that contain a
.value property. However, when we go to query for the rows in which the
.value property is
1, we get the following
We're missing a row - the one in which someone accidentally stored the
.value property as a String.
ASIDE: As we saw in the previous section, MySQL will fall-back to coercing values on-the-fly if we "unwrap" the
JSONvalues. Which means, we'll get back all 4 rows if we change
payload->>'$.value', which is a short-hand syntax for the
With great flexibility comes great responsibility. When you opt-out of the strict schema provided by a relational database, it means that the onus of enforcement falls upon the developer. You now become the one responsible for data integrity. Proceed with caution!
Want to use code from this post? Check out the license.