Over the last few weeks, I've been looking into feature enhancements that come with our upgrade to MySQL 5.7 at InVision. The two marquee features that really jumped out at me were the
JSON column type support and the virtual indexed columns. However, one more feature that I wanted to explore today is the
sys schema. The
sys schema is a set of Views (and other database objects) that provide more "human accessible" insights into the data collected by the MySQL Performance Schema. These insights might help me find ways to optimize SQL queries and apply better indexes to my application database tables.
Two Sets of Views in the
sys schema comes with two sets of Views: those that have the
x$ prefix, and those that do not. The
x$-prefixed tables contain "unformatted" data in terms of latencies (times are reported in picoseconds) and raw SQL. The tables without the
x$-prefix use human-readable latencies, like "1.4 m", and provide SQL query "hints" (that are basically unreadable).
I am founding that the
x$-prefixed tables are the most helpful since I want to see the raw SQL; and, it's easier to
ORDER BY picoseconds. Trying to
ORDER BY human-readable latencies provides undesirable outcomes as it will order the results based on String-values, not numbers. That said, I can still provide human-friendly latencies in the
SELECT columns by using the
sys Schema Data is Based on Runtime Performance
This probably goes without saying; but, the data provided in the
sys schema is based on runtime performance. Which means, you have to be looking at it on your production database in order for it to provide the most value. Attempting to look at the
sys schema on a non-production database (such as a staging database or a read-replica) will only give you insights about that non-production database. Attempting to port that findings onto a different database will be misguided.
ASIDE: The view that reports "redundant indexes" is probably portable, assuming your database schema is the same everywhere.
sys Schema Views of Interest
After looking through the
sys schema, I've collected a few SQL queries that feel like they'll be of the most interest for my level of database mastery. Historically, my tools of choice for database performance insights have been the
processlist - which shows which SQL queries are running right at this moment; and, the slow-query log - which shows which SQL queries have taken the longest to execute. I've never been a developer that understands much about database IOPs, threads, or locks. As such, I didn't really get much value out of the
sys Views that speak to those concerns.
For the following set of SQL queries, assume that something to this effect was being run before each of the queries:
SET @db = 'my_app_database';
sys schema provides insights across all schemas in a given MySQL instance. As such, I had to limit each query to a particular application schema in order to keep the records relevant. Strangely enough, each SQL query uses a different "name" for this constraint:
When it comes to database performance, nothing is as magical as database indexes. Indexes provide a way for us to quickly access records based on row properties. This performance, while critical, comes at a cost, both from a storage standpoint and a processing standpoint. Every time a row is update, the relevant indexes need to be updated and persisted to disk. As such, any unnecessary indexing mean that you're placing unnecessary load on the database.
schema_redundant_indexes view helps us locate indexes that are unnecessary since there are other "dominant" indexes that already contain the same column information. What this means it that the database is eating the cost of maintaining those indexes for no reason.
/** * Find unnecessary indexes. Having to maintain an index is expensive from both a latency * and a storage standpoint. If we can get rid of any redundant indexes, then we can save * space and improve performance by dropping them. */ SELECT -- The unnecessarily indexed columns. t.table_name, t.redundant_index_name, t.redundant_index_columns, -- The index that is already handling the aforementioned columns. t.dominant_index_name, -- The SQL statement you can execute in order to drop the unnecessary index. t.sql_drop_index FROM sys.schema_redundant_indexes t WHERE t.table_schema = @db ;
The clever thing about this view is that it provides a
sql_drop_index column, which is an actual
ALTER statement that shows you how to get rid of the unnecessary index.
Dovetailing with the previous index, the
schema_unused_indexes view shows us which indexes are not being used. Though, again, I have to stress the fact that this is based on production traffic. Which means, in order for this view to be helpful, your database has to have been live for long enough, and receive enough traffic, such that the access patterns are representative of the full-breadth of SQL queries within your web application.
As I stated above, every index comes at a cost in terms of performance and storage space. As such, if we can remove indexes that don't appear to be participating in any SQL queries, we can reduce the load on the database.
/** * Find unused indexes. Having to maintain an index is expensive from both a latency and a * storage standpoint. If we have indexes that we never use, then we can save space and * improve performance by dropping them. * * CAUTION: Unlike "redundant indexes", unused indexes are not deterministic. Meaning, * this set of indexes is based on access patterns of the application. If your application * database has not been running for a sufficient amount of time, then this list may not * be indicative of the full breadth of index utilization. */ SELECT t.object_name, t.index_name FROM sys.schema_unused_indexes t WHERE t.object_schema = @db ;
Based on what I'm seeing in this view, it becomes obvious that index-usage is a secondary indicator of unused tables! Meaning, if a table doesn't get accessed anymore within your application, none of its indexes will be used. As such, this view, while not directly related to table access, can give you insight into which tables might be droppable.
As I mentioned before, every database index comes with a cost of maintenance. The
x$schema_index_statistics view can help shed light on that cost by showing us the total latency (time) that has gone into updating an index whenever a row is inserted, updated, and deleted.
But, this view also shows us how much time is spent reading from a given index. This can show us which indexes are "hot"; which, in turn, helps us understand which queries are hot.
/** * Look at access and update performance for indexes. */ SELECT t.table_name, t.index_name, t.rows_selected, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.select_latency ) AS select_latency, t.rows_inserted, sys.format_time( t.insert_latency ) AS insert_latency, t.rows_updated, sys.format_time( t.update_latency ) AS update_latency, t.rows_deleted, sys.format_time( t.delete_latency ) AS delete_latency FROM sys.x$schema_index_statistics t WHERE t.table_schema = @db ORDER BY t.select_latency DESC ;
This data in this view was actually quite surprising for me. Granted, I'm only looking at a staging database (ie, not production); but, the index with the highest total
select latency was not at all what I would expect. And, probably indicates that a table is being accessed with unnecessary volume.
The thing that database indexes provide - the magic that they bring to the table (pun intended!) - is that they prevent full table scans. That is, they allow us to locate a subset of records without having to read every record into memory first. As you can imagine, a full table scan can be very expensive, depending on the size of the table in question.
x$statements_with_full_table_scans view allows us to see which of our SQL queries are performing full table scans. Which means, there's likely some opportunity for improvement, whether it be a new database index or some in-app caching.
/** * Find SQL queries that perform full-table scans. These are queries in which all the rows * in the table need to be read-in during query execution. * * NOTE: For us, this mostly includes queries that involve "look up" tables, like lists of * permission-types or status-types. */ SELECT t.query, t.exec_count, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.total_latency ) AS total_latency, t.no_index_used_count, t.no_index_used_pct, t.rows_sent, t.rows_examined, t.rows_sent_avg, t.rows_examined_avg FROM sys.x$statements_with_full_table_scans t WHERE t.db = @db ORDER BY t.no_index_used_pct DESC, t.no_index_used_count DESC ;
According to books and documentation that I've read, the MySQL query optimizer will sometimes choose to perform a full table scan even if an index is available. It will do this if the cost of reading from the table is less expensive than the cost of accessing the index. That said, in my experience, this is almost never true, even in a local development environment with a few records. Your mileage may vary.
That said, this view provides a
no_index_used_pct column, which is the percentage of time the given SQL statements performs a full table scan. What you'll see is that some queries always perform a full table scan whereas other queries only perform a full table scan in rare occasions.
For me, a lot of the SQL queries that show up in this view are those that perform some sort of
JOIN to a "look up" table. For example, a number of the queries
LEFT OUTER JOIN to
company_permission_type, which is a static list of key-value pairs. This table can probably just get cached in memory, moving the "join" into the application code.
x$schema_table_statistics view provides general CRUD (Create, Read, Update, Delete) insights around table access. This includes both the number of records affected as well as the total time it takes to perform the various CRUD operations.
This view can help us locate "hot" tables. And, tables that may benefit from new indexes. It may also give us insight into which tables should be destructured. For example, if a table has a high
update latency, it may mean that the updated data is causing too much row contention and some of the columns may need to be "extracted".
/** * Look at general latency statistics (how much overall time it takes) to perform various * CRUD operations (Create, Read, Update, Delete) on your database tables. */ SELECT t.table_name, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.total_latency ) AS total_latency, -- Number of rows read and total latency for read operations. t.rows_fetched, sys.format_time( t.fetch_latency ) AS fetch_latency, -- Number of rows inserted and total latency for insert operations. t.rows_inserted, sys.format_time( t.insert_latency ) AS insert_latency, -- Number of rows updated and total latency for update operations. t.rows_updated, sys.format_time( t.update_latency ) AS update_latency, -- Number of rows deleted and total latency for delete operations. t.rows_deleted, sys.format_time( t.delete_latency ) AS delete_latency FROM sys.x$schema_table_statistics t WHERE t.table_schema = @db ORDER BY t.total_latency DESC ;
Where as the previous view showed us table-level statistics, the
x$statement_analysis view gets into the low-level details about specific SQL queries, including how often they are executed, how long they take, and whether or not they required temp tables (which have a non-trivial performance cost).
/** * Find SQL queries that have the most latency (how much overall time it takes). */ SELECT -- The raw SQL statement being executed. t.query, t.exec_count, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.total_latency ) AS total_latency, sys.format_time( t.max_latency ) AS max_latency, sys.format_time( t.avg_latency ) AS avg_latency, sys.format_time( t.lock_latency ) AS lock_latency, t.rows_sent, t.rows_sent_avg, t.rows_examined, -- The total number of internal in-memory temporary tables created by occurrences of -- this query. t.tmp_tables, -- The total number of internal on-disk temporary tables created by occurrences of -- this query. t.tmp_disk_tables, t.rows_sorted FROM sys.x$statement_analysis t WHERE t.db = @db ORDER BY t.total_latency DESC ;
Here's a shocker for me - in our staging database (where I am doing this exploration), this is the SQL query with the highest average latency:
SELECT * FROM user LIMIT 1
Given the fact that this has no
WHERE clause, I can only assume that someone has chosen to include this SQL query in some sort of a health-check. But, here's the question: why the
SELECT *? This is reading-in the whole table row, which seems unnecessary for a health check. It's relatively slow, especially when you consider that something like this:
SELECT 1 FROM user LIMIT 1
... would use a covering index and be much faster.
Of course, if this is for a health check, then why not just use:
This is a valid query and should validate the database connection just as well.
I can already see that this
sys view is going to be super helpful. And, likely lead to a lot of cross-team communication.
When query for rows, the rows come out of the database in an order that is based on the structure of underlying indexes. As such, you often need to add an
ORDER BY clause to your SQL query. But, ordering records in a way that is different from the underlying indexes has a performance cost. The
x$statements_with_sorting view can shed light on that cost.
/** * Find SQL queries that perform sorts. Sorting in the DB can have a negative affect on * performance. Perhaps these queries can push sorting down into the application layer. */ SELECT t.query, t.exec_count, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.total_latency ) AS total_latency, t.sorts_using_scans, t.sort_using_range, t.rows_sorted, t.avg_rows_sorted FROM sys.x$statements_with_sorting t WHERE t.db = @db ORDER BY t.total_latency DESC ;
The thing to understand with an
ORDER BY is that you likely have one database server and many more application servers. As such, a super easy way to increase performance of the queries in this view is to move the
ORDER BY clauses into the application layer. In Lucee CFML, this could not be easier. The query returned from the
CFQuery tag has a
.sort() method. Which means that a SQL statement like this:
ORDER BY id ASC, name ASC, email ASC
... can effortlessly be replaced with a ColdFusion statement like this:
.sort( "id, name, email", "asc, asc, asc" )
And, just like that, you've load balanced your
ORDER BY cost across all of your ColdFusion application servers.
NOTE: I am not advocating that all
ORDER BYclauses be moved into the application layer. I'm only saying that if you see some "hot" SQL queries that have a high sort latencies, moving the
ORDER BYinto the application layer is a low-hanging fruit that might squeeze some better performance out of those SQL queries specifically.
When MySQL needs to create an intermediary result-set as part of its processing, it uses temp tables. Sometimes these temp tables are stored in memory; sometimes they are persisted to disk. To be honest, I can never remember all of the rules around when temp tables get created; but, I believe it's usually related to things like
And, I know they have some cost associated with them. The
x$statements_with_temp_tables view can help us see which SQL queries create the most temp tables. Which, in turn, may give us an indication of which queries need optimization.
/** * Find SQL queries that create in-memory or on-disk TEMP tables in order to calculate the * results. Temp tables can put a lot of load on the database. */ SELECT t.query, t.exec_count, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.total_latency ) AS total_latency, -- The total number of internal in-memory temporary tables created by occurrences of -- this query. t.memory_tmp_tables, -- The total number of internal on-disk temporary tables created by occurrences of -- this query. t.disk_tmp_tables, t.avg_tmp_tables_per_query, t.tmp_tables_to_disk_pct FROM sys.x$statements_with_temp_tables t WHERE t.db = @db ORDER BY t.total_latency DESC ;
x$statements_with_runtimes_in_95th_percentile view shows us the SQL queries where 95% of all the executions fall under the given latencies.
/** * Find SQL queries with the highest latency in the 95th percentile. This means, that 95% * of the query instances fall under the given latencies. */ SELECT t.query, t.exec_count, -- Latency in the "x$" tables are reported as picoseconds. This built-in function -- formats the raw time into human-friendly values, like "3.5 s" and "56.07 ms". sys.format_time( t.total_latency ) AS total_latency, sys.format_time( t.max_latency ) AS max_latency, sys.format_time( t.avg_latency ) AS avg_latency, t.rows_sent, t.rows_sent_avg, t.rows_examined, t.rows_examined_avg FROM sys.x$statements_with_runtimes_in_95th_percentile t WHERE t.db = @db ORDER BY t.avg_latency DESC ;
I am not great at statistics. I am not sure how this view will dovetail with the
x$statement_analysis view, which also gives us SQL query latency metrics. I supposed the P95 speaks more to how the vast majority of executions perform for a given SQL statement? I'll have to get more familiar with this one.
The last view that I think will be interesting is the
schema_auto_increment_columns view. This one shows us how much of our primary key space we consumed across all of our tables. In other words, it shows us how long until any given table goes Boom (when it runs out of auto-incrementing IDs).
/** * Find the amount of auto-increment "space" has been used. This may can help identify * tables that are running out of available ID values. */ SELECT t.table_name, t.column_name, -- The highest possible ID that can be created with this data-type. t.max_value, -- The last ID created in this table. t.auto_increment, -- The amount of "ID space" that has been used-up. When this hits 100%, things go -- "Boom"! CONCAT( ( t.auto_increment_ratio * 100 ), '%' ) FROM sys.schema_auto_increment_columns t WHERE t.table_schema = @db ;
Another interesting aspect of this view is that it shows us which tables used signed vs unsigned
INT values. Spoiler alert: they should all be unsigned. Use of any signed
INT values was a mistake; and, cuts-down significantly on the number of IDs that are available... until Boom!
This is Really Exciting!
As I mentioned at the onset, I've historically used the
processlist and the slow-query log to find poorly performing SQL queries. But, this
sys schema is really just gonna blow those techniques out of the water! Even running this on our staging database, I can already spot some issue - I'm feeling confident that when we let this schema loose on production, all kinds of interesting insights will come to the surface.
Epilogue: Lots of Slow ORM Queries
I'm a fan of writing SQL. Few things bring me more joy (and a sense of control over performance). Which is why I tend to shy away from ORMs: Object-Relational Mappers. ORMs provide a layer of abstraction over the database structures, which - in my opinion - makes it easier to write problematic SQL queries. And, to that end, I will say that I did see a lot of what looked like ORM-based SQL queries showing up with high latencies.
ORMs queries are usually easy to spot because they add a lot of "alias cruft" to the SQL statement, like this:
SELECT MyTable.id, MyTable.name FROM my_table AS MyTable ;
In this case, the
MyTable alias probably relates back to the "model name" used in the ORM definition.
I'm not trying to hate on ORMs here - I know people love them. It was just something that popped-out at me when looking through the
sys views. That said, a frustrating side-effect of using an ORM is that it makes the SQL queries harder to locate in the code because they are more generic. At least with a hand-written SQL statement, there are usually "hints" and "landmarks" that I can use in order to search-for the given query.