Skip to main content
Ben Nadel at CFUNITED 2008 (Washington, D.C.) with: Hal Helms
Ben Nadel at CFUNITED 2008 (Washington, D.C.) with: Hal Helms@halhelms )

Why I Use TINYINT Columns Instead Of BIT Columns For Boolean Data In A MySQL Application

By on
Tags: ,

Since the theme of my week appears to be database interactions (having looked at index structure query-performance on "membership" tables and prepared statements in the JDBC driver), I figured I would end the week on more database-related "hot take": I no longer use BIT columns to store "Boolean" data in MySQL. Instead, I use a TINYINT column.

CAUTION: There is no "right" answer on this topic - this is just my opinion based on my experience and the choices that have come back to haunt me / make my database interactions more challenging.

Semantically speaking, a BIT field is no more clear or meaningful than a TINYINT field when it comes to storing Boolean data. Because, at least in MySQL, a BIT field isn't a true-false data type - it's a binary data type. A BIT field contains "bits": N-number of bits, between 1 and 64. The only reason that it can represent a true-false value within an application is because the developers working on that application have all agreed to treat it as such.

A TINYINT field is also not a true-false data type - it's a numeric data type. And, like the BIT field, the only reason that it can represent a true-false value within an application is because the developers working on that application have all agreed to treat it as such.

So, semantically speaking, neither BIT nor TINYINT are "Booleans"; but, they can both be overloaded to represent a Boolean value if, and only if, the developers working on that application all agree on said social contract.

Now, storage-wise, there's also no difference between a BIT(1) and a TINYINT because a BIT field is actually stored in bytes. Meaning, in order to store a single bit, you have to store 8-bits. Which is also the amount of storage required by a TINYINT.

So, from a physical stand-point, you're not "saving space" by using a BIT instead of a TINYINT when attempting to represent a Boolean value.

To recap:

  • Neither BIT nor TINYINT is a "Boolean" value. We can only treat them as Boolean values when our developers agree to treat them as Boolean values.

  • Both BIT(1) and TINYINT require a byte of storage.

Since there's no semantically meaningful difference between the two data-types when it comes to representing Boolean values, neither data-type is "more correct". But, there are downsides to using a binary field in order to represent a true-false value:

  • The MySQL driver for Node.js returns BIT data as a Buffer; because, of course it would. After all, a BIT column represents binary data, which is what the Buffer represents in Node.js. As such, in order to translate a BIT(1) result into a true / false data-type, you have add special type-casing logic to your database client configuration.

  • The MySQL driver for Java returns binary data when BIT is used in a COALESCE() call. As such, if you are performing a LEFT OUTER JOIN on a BIT field and attempt to provide a default value with COALESCE(), you have to CAST() the resultant value back to an UNSIGNED type in order to use the value as a Truthy within your application code.

The reason I have articles written on the above two issues is because I've personally run into these two issues and wasted hours trying to understand the issue and come up with a work-around.

On the other hand, I have never personally run into an issue when using a TINYINT column to represent a "Boolean" value.

With all that said, I will conceit that MySQL will allow you to store 2 in a TINYINT field; but, will prevent you from storing 2 in a BIT(1) field because a 2 would require more bits than are defined on the column-type. As such, the BIT(1) type does place more constraints on the stored value.

That said, in my ColdFusion applications, if I'm persisting a true / false value, I'm generally not using numeric types in the code - I'm using Boolean types that get persisted as TINYINT types at write-time (using CFQueryParam):

public void function setIsFavorite(
	required numeric movieID,
	required boolean isFavorite // THIS IS A BOOLEAN ABSTRACTION!
	) {

	```
	<cfquery>
		UPDATE
			movie
		SET
			isFavorite = <cfqueryparam value="#isFavorite#" sqltype="tinyint" />
		WHERE
			id = <cfqueryparam value="#movieID#" sqltype="integer" />
	</cfquery>
	```

}

Of course, being that ColdFusion is a very dynamic, flexible language, you can pass a numeric value in for a Boolean type. As such, one could easily pass-in a value such as 2 for the isFavorite argument above and ColdFusion would:

  1. Not complain that it's a numeric value.

  2. Persist a 2 into the isFavorite field.

To fix this issue, you can normalize the isFavorite boolean at write time in a variety of ways such as the double-bang operator, or the booleanFormat(), yesNoFormat(), or trueFalseFormat() functions:

public void function setIsFavorite(
	required numeric movieID,
	required boolean isFavorite
	) {

	```
	<cfquery>
		UPDATE
			movie
		SET
			-- Normalize using Double-Bang operator:
			isFavorite = <cfqueryparam value="#( !! isFavorite )#" sqltype="tinyint" />,

			-- Or, booleanFormat:
			isFavorite = <cfqueryparam value="#booleanFormat( isFavorite )#" sqltype="tinyint" />,

			-- Or, yesNo:
			isFavorite = <cfqueryparam value="#yesNoFormat( isFavorite )#" sqltype="tinyint" />,

			-- Or, trueFalse
			isFavorite = <cfqueryparam value="#trueFalseFormat( isFavorite )#" sqltype="tinyint" />
		WHERE
			id = <cfqueryparam value="#movieID#" sqltype="integer" />
	</cfquery>
	```

}

In the end, neither the BIT type nor the TINYINT type are a prefect choice when representing a Boolean value within a MySQL database. They are both semantic overloads that take up the same amount of storage space. In the end, the choice is personal; and for me, the TINYINT makes the most sense since a TINYINT never causes issues in my SQL queries (in the way that BIT has and does). And for me and my style of ColdFusion development, that's the right trade-off.

Your mileage may vary.

Epilogue On The MySQL JDBC Driver

In writing this article, I discovered that the MySQL JDBC Driver has a few Boolean-related properties that can be set:

  • tinyInt1isBit: Should the driver treat the datatype TINYINT(1) as the BIT type (because the server silently converts BIT -> TINYINT(1) when creating tables)? Default: true.

    When I was experimenting with this, what I noticed is that the value comes into the ColdFusion application as a 0 or 1; however, the value may still be persisted as a larger integer in the data-table.

  • transformedBitIsBoolean: If the driver converts TINYINT(1) to a different type, should it use BOOLEAN instead of BIT for future compatibility with MySQL-5.0, as MySQL-5.0 has a BIT type? Default: false.

I had never seen these before, so I have no feelings about them.

Want to use code from this post? Check out the license.

Reader Comments

414 Comments

Very interesting. I have always used TINYINT, but, not because I am smart. It's just the way I have always done it. I guess it's maybe because the word bit frightens me!

What I don't really understand, is why you would want to do:

booleanFormat( isFavorite )

Wouldn't this just return true or false, when in fact, you want to submit either a 0 or 1

And wouldn't this throw an exception? Because you are trying to submit true/false into a TINYINT field?

15,331 Comments

@Charles,

Good question - the CFQueryParam tag is actually converting the strings "true" and "false" into the numeric equivalent for the TINYINT parameter type. In this case "true" gets converted to "1" and "false" gets converted to "0". The conversions are only there to protect against someone trying sneak a number through as one of the boolean values. The reality is, nothing would really be "damaged" by having a value like 2 in the database. The downside would be that a SQL query that only ever checks for 0 or 1 will just never match against that record.

1 Comments

You should have an asterisk next to 1 bit uses 1 byte of storage. It is my understanding that every set of 8 bit-fields uses the same 1 byte in a row of data. I know this doesn't sound like a big difference but I guess there are databases where this could really add up.

cnt bits tinyint
1 1 byte 1 byte
4 1 byte 4 bytes
8 1 byte 8 bytes
9 2 bytes 9 bytes
15 2 bytes 15 bytes
32 4 bytes 32 bytes

I only realized this because I had the same thought awhile ago. Sometimes rules change where a column may start out as a yes/no but could turn into a type or how many. I looked into why not just make them all tinyint from the start. Well that is why.

1 Comments

Is there a blog explaining why should we even use tinyint/bit over boolean? Early in career i learnt that has performance benefits. I'm yet to find a concrete explanation on how it is better. Another thing , as you have mentioned multiple times about using other options to represent boolean works because "the developers working on that application have all agreed to treat it as such" . Isn't it then difficult from maintenance point of view?
Appreciate your help on the topic.

15,331 Comments

@Jivan,

Thank you for providing more appropriate links 💪 I don't think I've ever heard of NBD storage - I'll have to look that up.

1 Comments

Bits use less storage than tinyint (significantly less, as the number of bit columns increases). Bits process faster than tinyints. A programmer who uses tinyint is signaling to the next programmer, and the next programmer will assume, the data is meant to contain values greater than 1. If 1s and 0s only represent true and false, and are not actually true and false, then 1 more accurately represents true and 0 false that reversing them would not make sense. However, 1s and 0s do not represent true and false. 1 and 0 are true and false, as math itself is an abstraction for logic. If it were not the case, Karnaugh maps would not work, and math itself would fail. To suggest their interpretation could be reversed is to suggest 5 could represent 6. At that point, further discussion is not helpful.

15,331 Comments

@Johnny,

At least in MySQL, I don't believe you are correct about the storage requirements. From what I am seeing in the documentation for InnoDB storage, bits are stored as multiples of bytes. So, a TINYINT and BIT(1) both use a single byte.

As far as "bit are processed faster", I am not sure I understand what that means. What processing are you talking about? In this post, I am specifically talking about using TINYINT to represent True and False values, so I am not doing any processing on the value itself (other than casting it to Truthy/Falsey -- but that is happening in the application code, not in the database code).

And, as far as signaling intent to the next developer, I would only offer up that this has not been an issue for the teams that I work on. Normally, when an engineer adds a datatable, they also add the data-access layer that consumes that datatable. The data access layer deals in Boolean values; and, only casts to TINYINT when writing to the database. As such, signaling from the code is usually very clear. Not to mention that the database column names choose a truthy-style semantic, like isActive, isArchived, isApproved, etc. As such, to try and put a 5 into an isActive column wouldn't make any sense.

I definitely see your points; but, at least in my experience, they feel more grounded in theory than in practice. But, again, your mileage may vary.

Post A Comment — I'd Love To Hear From You!

Oops!
NEW: Some basic markdown formatting is now supported: bold, italic, blockquotes, lists, fenced code-blocks. Read more about markdown syntax »
Comment Etiquette: Please do not post spam. Please keep the comments on-topic. Please do not post unrelated questions or large chunks of code. And, above all, please be nice to each other - we're trying to have a good conversation here.