June 16, 2014, 10:30 AM —
MySQL is a powerful and performant database engine that often suffers from performance issues due to misconfiguration. Obvious culprits for bottlenecks are available memory and CPU resources but disk I/O can be a common problem as well.
The configuration options available for use in the my.cnf file of a MySQL installation can be mysterious. Even technical folks who work with the system on a daily basis routinely misinterpret the basic meaning of config variables (query_cache_limit vs. query_cache_size in particular). The default settings that come with a fresh installation might be fine for a few Wordpress sites on a single server, but if you’re hosting several virtual machines running MySQL server, each holding dozens of databases, you’ll likely run into serious performance issues. One of those issues is commonly known as disk thrashing or I/O thrashing.
As you probably know, RAM is fast for I/O and disks are comparatively slow. If every operation MySQL server needs can be fulfilled in memory, you’re going to have a good time. If every operation results in a read or write to disk, you’re going to have a bad time. In reality there will be a mix of both as data is persisted to disk, but in general you want to configure your server to do as much work as possible in memory.
Out of the box, MySQL is polite and will eat up very little memory until you unleash it on more. If you’ve got a database that is several GB and you get a few hundred concurrent visitors which result in a database query, but MySQL is only using 512MB of RAM, you will probably experience I/O thrashing. As you can imagine, if 200 connections are coming in and each request is looking for different data, the server will need to page in/out the data from the disk very rapidly since it can’t fulfill the requests from such little memory. There can be several causes and solutions for this problem.
One quick way to discover if you’re experiencing disk thrashing is to take a look at the server’s load value vs. its CPU utilization. If the load is high but the CPU utilization is low, the load (or queue of tasks waiting to be executed) may be caused by high disk I/O wait times.
In the example above of 200 simultaneous connections, the first step (ignoring content caching) will obviously be to allow MySQL access to more of the system’s RAM, provided it has more available. With multi-gigabyte databases, MySQL can use quite a bit of memory as it needs not just the space for the database itself, but also space to perform work on each request such as table joins, temp tables, caching, logging, and more. As the database size grows, so does the space needed for those operations. On top of that, the number of connections multiplies the number of open tables in the cache and so forth.
Instead of trying to provide my.cnf specifics since each scenario is different and I’m no MySQL expert, I’ll point you to helpful resource for creating a baseline my.cnf from Percona Tools and a perl script which offers tuning tips based on your actual database usage called mysqltuner.
In my experience, the two most helpful config changes to resolve disk thrashing in a virtualized environment have been innodb-flush-log-at-trx-commit and tmp_table_size. I can’t say that these settings will solve your issues if you have them, but in my case having many large instances running on the same hardware caused the primary disk array to peg at 100% utilization. By adjusting these two parameters, the utilization dropped to a steady 30%.
First, innodb-flush-log-at-trx-commit. By default the parameter is set to 1 which makes InnoDB ACID compliant. Since the databases I’m dealing with do not typically have critical transaction needs (blogs mostly), I’m able to relax this setting (2) or turn it completely off(0) with the knowledge that I could lose a second or so of data in the event of a crash. By disabling it, the server no longer writes to disk on every transaction saving a lot of I/O. You should not change this setting haphazardly. Read the full documentation here to see the options.
Next, tmp_table_size. When MySQL performs a query which requires a new combination of data to be held for reference (a JOIN operation for instance), the engine will create a temporary table. If the tmp_table_size and max_heap_table_size are too low, the temp tables will be created on the disk instead of in memory. If you’re running a lot of databases or very large ones, this can happen a lot. You should increase the values to allow more of those operations to take place in memory. I’ve seen a rule of thumb calculation as RAM / max connections for this value.
Performance tuning on MySQL is difficult. Every situation is unique so it’s not easy to give a clear answer. If you find yourself in a situation where I/O is spiking and the disk has become your bottleneck, take a look at the above mentioned parameters to see if they can help get you out of trouble.