Thought Experiment: Splitting A Single Data Table Up Into High-Writes And High-Reads For Better Performance In MySQL
CAUTION: This post is just me thinking out-loud about an interesting MySQL database performance problem that I have faced at work.
At InVision, we store a date/time stamp that represents the last request that each user makes to the application. We used to use this data-point to indicate "presence" within the application; and now, we use it to better understand the heartbeat of the system. For the sake of this thought experiment, imagine that we have a user
table that has the following columns:
id
(pkey)name
email
lastRequestAt
(date/time)
As you might imagine, in a thick-client, AJAX (Asynchronous JavaScript and JSON) driven application, the user makes a lot of requests to the server. As such, this lastRequestAt
column gets written-to the database a lot. The problem is, this lastRequestAt
column is part of the user
table; which, as you might also guess, gets read-from a lot. In fact, every request during a logged-in Session would cause both a read-from and a write-to the user
table.
For the sake of discussion, image that each request to the application results in the following two SQL queries:
UPDATE
user u
SET
u.lastRequestAt = UTC_TIMESTAMP()
WHERE
u.id = ?
;
SELECT
u.id,
u.name,
u.email,
u.lastRequestAt
FROM
user u
WHERE
u.id = ?
;
In the early stages of the product, this wasn't a problem. We had a relatively small set of users and a reasonable amount of traffic. Reading-from and writing-to the user
table was a non-issue. But, as the size of the database grew, and the traffic increased, we started to see a lot of row-level locking issues, specifically around UPDATE
and SELECT
calls to the user
table.
Being in "start-up mode", we solved this problem with the lowest-level of effort possible (so that we could get back to Feature Development). To that effect, we simply lowered the fidelity of the date/time stamp. Instead of logging it on every request, we only logged it when the last recorded value was more than 60-seconds old.
Then, we changed it to 5-minutes old.
Then, we changed it to 10-minutes old.
We kept changing it because each change would reduce - but not eliminate - the row-level locking and latency problems on the user
table.
Eventually, we started dual-writing the lastRequestAt
value to both the MySQL database and to the Session object, which we store in Redis. And, since Redis is insanely fast, and doesn't have to maintain database indexes, the Redis value helped to minimize the number of concurrent Reads and Writes to the user
table.
Last week, in our Architecture Office Hours meeting, I brought this topic to the group and asked them what kind of data-access pattern we might create if we could go back and rebuild the way that we store this lastRequestAt
column. Talking with Iccha Sethi, one of our Principal Engineers, I think we came up with some interesting ideas.
First, we could split the user
table up into high-read and high-write columns. So, instead of having the lastRequestAt
column live in the user
table, we could elevate the column to have its own domain concept: perhaps an additional table like, user_presence
. This would leave us with two tables:
user
- high-volume readsid
(pkey)name
email
user_presence
- high-volume reads and writesuserID
(pkey)lastRequestAt
This split would fix the user
table, leaving it as a high-read-only table; but, we may still end up with the same locking issue on the new user_presence
table. After all, we only moved the column - we didn't fundamentally change anything about the request architecture.
In addition to splitting the user
table up into two tables, we could also use Redis to buffer some of that data churn. In much the same way as we are doing now, we could store the current lastRequestAt
value in Redis such that Redis would act as a write-through cache. This way, every request to the application would read-from the Redis store, but write-to the MySQL table (as well as the Redis store - the way a write-through cache works).
Pseudo-code for this workflow might look like this:
<cfscript>
// Read the session from Redis.
userSession = redis.hgetall( "sessions:#userID#:#sessionToken#" );
// Optimistically update the in-memory value for presence.
userSession.lastRequestAt = now();
// ASYNCHRONOUSLY log the request timestamp to both Redis and MySQL.
runAsync(
() => {
// Update the write-through cache.
redis.hset(
"sessions:#userID#:#sessionToken#",
"lastRequestAt",
userSession.lastRequestAt
);
// Update the underlying database record.
userPresenceGateway.update({
userID: userSession.id,
lastRequestAt: userSession.lastRequestAt
});
}
);
</cfscript>
With this architecture, the new user_presence
table becomes much more of a write-heavy table. The current user only needs to read-from it when the Session is initialized. And, other users only need to read-from it when they are rendering a list of users along with their respective presence states (which, in our application, is not nearly as often).
Of course, if the user_presence
table starts to become its own performance problem, we could always put a write-through cache in front of the entire table. Meaning, we could cache the entire user_presence
table, which only has two-columns per user, in Redis. Then, read from Redis as needed instead of ever reading from MySQL (except when populating the cache).
ASIDE: Caching an entire table could be tricky. How does that cache get populated in a horizontally scaled application? Are their TTL (Time To Live) expirations on the keys? Could you populate the cache just-in-time as different
userID
values are requested?
Or, maybe we could even store the user_presence
data entirely within its own Redis instance; and then, configure that Redis instance to persist to disk. This way, we would completely isolate the data for "user presence" within its own incredibly fast RAM-based key-value store (eliminating the user_presence
table from MySQL altogether).
I suspect that each one of these steps becomes progressively more expensive from both a logic and a dollars-and-cents point-of-view.
Right now, some of this stuff is just theory (for me). But, taken as a whole, what it represents is a shift in how I think about Redis. As performance becomes a problem, Redis isn't just a "nice to have" way of improving performance - it becomes a fundamental part of the Application Architecture. And, to be honest, I'm not used to thinking in a multi-persistence model. I was raised on relational databases; and, even thinking about storing specific-types of data in different types of specialized databases is not yet something I have a good instinct for.
Another general pattern that I am seeing here is the elevation of "domain concepts". What I mean by that is the promotion of an "entity property" into a full-on "entity". In this case, the "User" property lastRequestAt
is being promoted to the domain-level concept of "Presence".
The more complex my applications become, the more I find myself doing - or wishing I had done - this (elevating entity properties). Creating a domain-level concept makes it easier to think about how data can be consumed and evolved within an application.
And, of course, if anyone has any suggestions here on how they've solved similar problems, I would love to hear about it. There's definitely no "one right answer" here. Different contexts demand different solutions. The trickiest part is sometimes just knowing when you have to stop and modify your current approach.
Want to use code from this post? Check out the license.
Reader Comments
@All,
It might be time for me to flip-through Designing Data-Intensive Applications by Martin Kleppmann. I read this book a few years ago:
www.bennadel.com/blog/3344-designing-data-intensive-applications-the-big-ideas-behind-reliable-scalable-and-maintainable-systems-by-martin-kleppmann.htm
... and it mostly went way over my head. But, this sounds exactly like one of the topics Martin would have discussed.
I might also want to flip-through High Performance MySQL by Baron Schwartz, Peter Zaitsev, and Vadim Tkachenko:
www.bennadel.com/blog/3168-high-performance-mysql-optimization-backups-and-replication-by-baron-schwartz-peter-zaitsev-and-vadim-tkachenko.htm
This is another book I read a few years ago (most of which went over my head). But, they would also have been likely to cover the topic of high read/write tables.
Hi Ben,
Could you do something like multiple rows for each users presence so say for the table userPresence you had
UserId, Segment, Timestamp
1, 0, '2019-10-11 15:06:00',
2, 15, '2019-10-11 15:06:07',
3, 30, '2019-10-11 15:06:06',
3, 45, '2019-10-11 15:06:01',
Then when you want to read Bens last presence you would do something like SELECT * from userPresence where UserId = 2 ORDER by timestamp LIMIT 1; to get most recent.
The difficult thing being the how to divide up but perhaps if you could do it so consecutive writes were segmented based on time so 1st write went to row with segment 0, second segment 15 something etc where the segments where small enough for consecutive writes to go to different rows.
Not sure how that would perform or lock up when you read, but it would avoid you writing to the same row repeatedly.
@Adam,
oops sorry mistyped meant each userId to be the same number so
1, 0, '2019-10-11 15:06:00',
1, 15, '2019-10-11 15:06:07',
1, 30, '2019-10-11 15:06:06',
1, 45, '2019-10-11 15:06:01',
I wonder if you could solve this by queuing. When you process the queue every minute or so, you remove the duplicates and only write the most recent timestamp for each user. This would probably reduce the number of writes you do significantly if there is a high percentage of duplication.
@Pete,
Having been reading up a bit on Redis lately, the idea of queuing actually popped to mind using their Lists. In their accompanying documentation, they have patterns for "delayed tasks", which could maybe used for the queuing mechanism:
https://redislabs.com/ebook/part-2-core-concepts/chapter-6-application-components-in-redis/6-4-task-queues/6-4-2-delayed-tasks/
The complexity with a queue is that you need an additional worker process to be consuming the queue; so, there is some complexity there. Though, we're already talking about a context where some increased complexity is already needed (hence this entire post).
I feel like if I was better with something like Redis or RabbitMQ or SQS, then such a solution would be easier for me to think about.
@Adam,
It's an interesting idea. But, I think part of the problem with our AJAX / Single-Page Application is that a user may spawn many concurrent requests to the app. So, even with the bucketing, I suspect that you'd still get contention around a single record. Of course, I think having it in its own table would go a long way towards locking. Unfortunately, I don't know enough about database mechanics to know what return on investment is.
The only issue I can think of, with delegating more of the responsibility to Redis, is that if the server goes down, you lose the data from Redis.
The other solution is to store the timestamp, alongside the userid, in a JWT/JWE. The web token is then passed back & forwards from client to server. During the JWT's validation process on the server, you could just update the timestamp. In fact, it is good practice to check the timestamp to see if the JWT has expired.
I generally issue a new JWT, when the user logs in and then instead of using session variables, I pass a custom JSON object inside the custom key of the JWT, which contains all the user related values for that session. Any values that I need to persist across page refreshes, which don't happen too often in an SPA, I store in a client cookie. You can further enhance security by using a JWE. I wrote a ColdFusion library for JWE, which you could use:
https://www.forgebox.io/view/JWTSignEncrypt
Actually, I think I wrote this, off the back of one of your articles on JWT.
Hey Ben,
Great read; it's always interesting to hear about real-world problems in systems at scale.
Rather than fighting against the row locking, could you use dirty reads to avoid the locks?
With a database like MS SQL Server you would use the "WITH NOLOCK" hint on the select. I'm no MySQL specialist, but it looks like the equivalent functionality would be achieved using "READ UNCOMMITTED" at the transaction level:
https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html
Some more info talking this subject for MySQL:
http://itecsoftware.com/with-nolock-table-hint-equivalent-for-mysql
Probably shouldn't have LastUpdatedAt field in the User table, or really any field that changes a lot as an attribute in a object table. Maybe better to have a log table and stitch together with a view. Some types of these tables, like a User table, can have insert/update/delete triggers on it for auditing (say into the UserAudit table).
@Sean,
Definitely tending to agree with that these days. A learning experience for me. I tend to think primarily about read-access and what indexes I need to create. I am only now just starting to think more holistically about write access patterns.
@Glenn,
That's a great question. I vaugely remember looking into the
NOLOCK
hint back in the day when I was on SQL SERVER. But, I don't remember much about it. I think it was all theory to me at that point. I also don't have enough insight to know how this would affect performance. I'll keep that in the back of my mind in case I find an easy way to A/B test something like this in production (where I wouldn't cause any harm).@Charles,
I'm fascinated with JTW; but, I am still very much a n00b when it comes to pragmatic applications. Right now, our systems half-run on JWT and half-run on old-school Session IDs that are mapped to persisted session data. So, to your point, if Redis goes down, "the system" goes down :D
I have a small book on JWT queued-up in my iBooks; but, I just haven't gotten around to reading it yet.
Wait a second. You can't be that much of a noob:) You wrote an inspiring article on JWT, with an accompanying JWT library. I then extended this to create a JWE library!
Maybe, you forgot about it! It's easy to do!
You can use Kafka to get the queue of the user activity. And then read from there by blocks the latest or all states of user actions
@Alexander,
I know of Kafka, but I haven't used it myself yet. Plus, it would be an entirely other piece of infrastructure to have to setup and maintain. I do like the idea of "events as the source of truth" in theory; but, I've never actually put it into practice. That said, from things I have read, people are still usually "materializing" the event-stream into some sort of a database table so that it can quickly be read. In that case, I am wondering if we would still run into the same problem of having to scale the writes into that table.
hi,
if you have a lot of users/clients and they generate very high traffic then I think you should split the table into more databases. Every database should be on a separate hardware. You should guess a rule how to separate by CLIENT_ID. for example: if the CLIENT_ID is an even number then store into DATABASE-A else DATABASE-B.
If it is not enough, then you can split the table into 10 parts (10 separate databases) if there are more million clients with more than 1000 request/second. Separate CLIENT_IDs by last digit of the ID.
xxxxx0 -> Database-0
xxxxx1 -> Database-1
For reading we use mssql with readonly-replica async mode. I am not familiar with mysql but I think it has some similary solution.
best regards
@All,
On a slightly related note, I just posted a blog entry with some feelings about join tables vs. entity tables:
www.bennadel.com/blog/4267-sql-join-tables-are-just-entity-tables-with-hard-to-name-concepts.htm
While it's not a hill I am prepared to die on, I am going to try and see if I can promote all join tables to be more like entity tables. I think the hardest part of this will be coming up with names for the concept that the join table represents. We'll see - it's a fun thought experiment, if nothing else.
I agree with Alex that Kafka solves this problem at scale. Specifically, you're "enqueuing writes" which is a requirement at high scale. If the writes had to be eventually consistent and fast, you'd enqueue the write to kafka with Log Compaction enabled. You'd then have competing consumers, one set for updating Redis, the other set for updating your (separate) mysql table (which might even reside on separate mysql "hardware" as a microservice).
Log Compaction means each user would have only a single Kafka row. If you lost Redis you could just replay the log starting your consumers at the beginning. Your MySQL consumers would eventually bring the read or write table up to date, but only for your convenience for querying elsewhere.
I'm making lots of assumptions around tolerance, but this would be very, very fast and also durable.
Depending on why you return last access, i also endorse the solution. to change your transaction isolation level on your writes to "Read Committed" and on your reads to Read Uncommitted.
@Bryan,
My big hesitation with Kafka is that, as a technology, it's not something you can lightly "sprinkle" into an existing application. From what I've seen, pulling in Kafka usually means that you are fundamentally changing the way the application is architected and wired together. I'm not saying that this is a "Bad Thing" - only that it adds a large barrier to entry. One of these days, I'll play around with it.