Skip to main content
Ben Nadel
On User Experience (UX) Design, JavaScript, ColdFusion, Node.js, Life, and Love.

Why I Use TINYINT Columns Instead Of BIT Columns For Boolean Data In A MySQL Application

By Ben Nadel on
Tags: ColdFusion, SQL

Since the theme of my week appears to be database interactions (having looked at index structure query-performance on "membership" tables and prepared statements in the JDBC driver), I figured I would end the week on more database-related "hot take": I no longer use BIT columns to store "Boolean" data in MySQL. Instead, I use a TINYINT column.

CAUTION: There is no "right" answer on this topic - this is just my opinion based on my experience and the choices that have come back to haunt me / make my database interactions more challenging.

Semantically speaking, a BIT field is no more clear or meaningful than a TINYINT field when it comes to storing Boolean data. Because, at least in MySQL, a BIT field isn't a true-false data type - it's a binary data type. A BIT field contains "bits": N-number of bits, between 1 and 64. The only reason that it can represent a true-false value within an application is because the developers working on that application have all agreed to treat it as such.

A TINYINT field is also not a true-false data type - it's a numeric data type. And, like the BIT field, the only reason that it can represent a true-false value within an application is because the developers working on that application have all agreed to treat it as such.

So, semantically speaking, neither BIT nor TINYINT are "Booleans"; but, they can both be overloaded to represent a Boolean value if, and only if, the developers working on that application all agree on said social contract.

Now, storage-wise, there's also no difference between a BIT(1) and a TINYINT because a BIT field is actually stored in bytes. Meaning, in order to store a single bit, you have to store 8-bits. Which is also the amount of storage required by a TINYINT.

So, from a physical stand-point, you're not "saving space" by using a BIT instead of a TINYINT when attempting to represent a Boolean value.

To recap:

  • Neither BIT nor TINYINT is a "Boolean" value. We can only treat them as Boolean values when our developers agree to treat them as Boolean values.

  • Both BIT(1) and TINYINT require a byte of storage.

Since there's no semantically meaningful difference between the two data-types when it comes to representing Boolean values, neither data-type is "more correct". But, there are downsides to using a binary field in order to represent a true-false value:

  • The MySQL driver for Node.js returns BIT data as a Buffer; because, of course it would. After all, a BIT column represents binary data, which is what the Buffer represents in Node.js. As such, in order to translate a BIT(1) result into a true / false data-type, you have add special type-casing logic to your database client configuration.

  • The MySQL driver for Java returns binary data when BIT is used in a COALESCE() call. As such, if you are performing a LEFT OUTER JOIN on a BIT field and attempt to provide a default value with COALESCE(), you have to CAST() the resultant value back to an UNSIGNED type in order to use the value as a Truthy within your application code.

The reason I have articles written on the above two issues is because I've personally run into these two issues and wasted hours trying to understand the issue and come up with a work-around.

On the other hand, I have never personally run into an issue when using a TINYINT column to represent a "Boolean" value.

With all that said, I will conceit that MySQL will allow you to store 2 in a TINYINT field; but, will prevent you from storing 2 in a BIT(1) field because a 2 would require more bits than are defined on the column-type. As such, the BIT(1) type does place more constraints on the stored value.

That said, in my ColdFusion applications, if I'm persisting a true / false value, I'm generally not using numeric types in the code - I'm using Boolean types that get persisted as TINYINT types at write-time (using CFQueryParam):

public void function setIsFavorite(
	required numeric movieID,
	required boolean isFavorite // THIS IS A BOOLEAN ABSTRACTION!
	) {

	```
	<cfquery>
		UPDATE
			movie
		SET
			isFavorite = <cfqueryparam value="#isFavorite#" sqltype="tinyint" />
		WHERE
			id = <cfqueryparam value="#movieID#" sqltype="integer" />
	</cfquery>
	```

}

Of course, being that ColdFusion is a very dynamic, flexible language, you can pass a numeric value in for a Boolean type. As such, one could easily pass-in a value such as 2 for the isFavorite argument above and ColdFusion would:

  1. Not complain that it's a numeric value.

  2. Persist a 2 into the isFavorite field.

To fix this issue, you can normalize the isFavorite boolean at write time in a variety of ways such as the double-bang operator, or the booleanFormat(), yesNoFormat(), or trueFalseFormat() functions:

public void function setIsFavorite(
	required numeric movieID,
	required boolean isFavorite
	) {

	```
	<cfquery>
		UPDATE
			movie
		SET
			-- Normalize using Double-Bang operator:
			isFavorite = <cfqueryparam value="#( !! isFavorite )#" sqltype="tinyint" />,

			-- Or, booleanFormat:
			isFavorite = <cfqueryparam value="#booleanFormat( isFavorite )#" sqltype="tinyint" />,

			-- Or, yesNo:
			isFavorite = <cfqueryparam value="#yesNoFormat( isFavorite )#" sqltype="tinyint" />,

			-- Or, trueFalse
			isFavorite = <cfqueryparam value="#trueFalseFormat( isFavorite )#" sqltype="tinyint" />
		WHERE
			id = <cfqueryparam value="#movieID#" sqltype="integer" />
	</cfquery>
	```

}

In the end, neither the BIT type nor the TINYINT type are a prefect choice when representing a Boolean value within a MySQL database. They are both semantic overloads that take up the same amount of storage space. In the end, the choice is personal; and for me, the TINYINT makes the most sense since a TINYINT never causes issues in my SQL queries (in the way that BIT has and does). And for me and my style of ColdFusion development, that's the right trade-off.

Your mileage may vary.

Epilogue On The MySQL JDBC Driver

In writing this article, I discovered that the MySQL JDBC Driver has a few Boolean-related properties that can be set:

  • tinyInt1isBit: Should the driver treat the datatype TINYINT(1) as the BIT type (because the server silently converts BIT -> TINYINT(1) when creating tables)? Default: true.

    When I was experimenting with this, what I noticed is that the value comes into the ColdFusion application as a 0 or 1; however, the value may still be persisted as a larger integer in the data-table.

  • transformedBitIsBoolean: If the driver converts TINYINT(1) to a different type, should it use BOOLEAN instead of BIT for future compatibility with MySQL-5.0, as MySQL-5.0 has a BIT type? Default: false.

I had never seen these before, so I have no feelings about them.



Reader Comments

Very interesting. I have always used TINYINT, but, not because I am smart. It's just the way I have always done it. I guess it's maybe because the word bit frightens me!

What I don't really understand, is why you would want to do:

booleanFormat( isFavorite )

Wouldn't this just return true or false, when in fact, you want to submit either a 0 or 1

And wouldn't this throw an exception? Because you are trying to submit true/false into a TINYINT field?

Reply to this Comment

@Charles,

Good question - the CFQueryParam tag is actually converting the strings "true" and "false" into the numeric equivalent for the TINYINT parameter type. In this case "true" gets converted to "1" and "false" gets converted to "0". The conversions are only there to protect against someone trying sneak a number through as one of the boolean values. The reality is, nothing would really be "damaged" by having a value like 2 in the database. The downside would be that a SQL query that only ever checks for 0 or 1 will just never match against that record.

Reply to this Comment

You should have an asterisk next to 1 bit uses 1 byte of storage. It is my understanding that every set of 8 bit-fields uses the same 1 byte in a row of data. I know this doesn't sound like a big difference but I guess there are databases where this could really add up.

cnt bits tinyint
1 1 byte 1 byte
4 1 byte 4 bytes
8 1 byte 8 bytes
9 2 bytes 9 bytes
15 2 bytes 15 bytes
32 4 bytes 32 bytes

I only realized this because I had the same thought awhile ago. Sometimes rules change where a column may start out as a yes/no but could turn into a type or how many. I looked into why not just make them all tinyint from the start. Well that is why.

Reply to this Comment

Post A Comment

You — Get Out Of My Dreams, Get Into My Blog
Live in the Now
Oops!
NEW: Some basic markdown formatting is now supported: bold, italic, blockquotes, lists, fenced code-blocks. Read more about markdown syntax »
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.