Speed up your MySQL database

Fast Track

InnoDB Parameters

Now I'll move on to the most critical parameters for the InnoDB storage engine.

  • You can set the InnoDB Buffer Pool using the innodb_buffer_pool_size parameter. This defines how much RAM the InnoDB requests from the operating system to cache InnoDB data and index blocks. RAM is around 100,000 times faster than a disk; thus, you can significantly accelerate your queries if the data resides in RAM instead of on your hard disk. The tendency here is that more is better. As a rule of thumb, you will want to assign around 80 percent of your free RAM for the InnoDB Buffer Pool.
  • The innodb_log_file_size parameter defines the size of the InnoDB transaction logs, of which there are normally two. The size of these two files influences the speed of write access to InnoDB. This value was far too small for many years (5MB). The new default values in MySQL  5.6 take this into account, changing it to 48MB. The tendency is to assume that bigger means better write performance, but it also means a longer recovery time in the case of a system crash. Caution: If you have MySQL older than v5.5, check the documentation to discover how to change the size of this file. If you get things wrong, your database might not start at all.
  • The innodb_flush_log_at_trx_commit parameter defines how data is written to disk in the case of a COMMIT. The default value 1 stands for the cleanest behavior, which is also ACID-compliant. That means MySQL writes the data from the log buffer to the transaction logs on a COMMIT and then performs a flush (fsync). In the case of a system crash, this ensures that you do not lose any data. This safe setting also has some drawbacks: Flushing is slow, and it affects writes throughput of the system. If write performance is more important than minor data loss from a crash (ca. one second), then you can set the value of innodb_flush_log_at_trx_commit to 0 (empty and flush the log buffer once a second) or  2 (empty the log buffer on each COMMIT and write out to disk once a second). This will give you a significant increase in the write throughput.

A general overview of MySQL server and MySQL status variables can be found online [1].

Level 4: Application Tuning

The fourth tuning level relates to the application, which primarily means query tuning in the case of databases. This is where you can achieve the greatest performance gains. Unfortunately, doing so typically means changing the design, or at least the code, of the application, which is not always possible and not always well received by developers.

Even without intervening in the code, however, you can often achieve performance gains through intelligent indexing. Indexes mainly serve to accelerate read access (SELECT) to data. In many cases, indexes are also suitable for finding rows more quickly in UPDATE or DELETE operations. You can best imagine an index as being something like a phone book (city, family name, first name) or register in the library (order by title, author, etc.).

If you are looking for a person based on city, family name, and first name, this will be a fairly quick process in the phone book. However, if you do not know the city, your search will be extremely time consuming – you would need to search through every single phone book from the start to the finish. The MySQL database has a similar problem: If an index is missing, it triggers a computationally expensive full-table scan; if the first attribute in the index is unknown, it leads to a full-index scan.

If you have an index, the query can access the correct row in the table directly via the index entry or a range of index entries. By creating the right indexes, users can help the database avoid a huge amount of work, and this often leads to 10 or 50 times faster query performance.

Which Indexes Do You Need?

Each table should have a unique row identifier (ID). MySQL often uses an AUTO_INCREMENT field as the primary key for this. If the table does not have a primary key, it is not very elegant from an academic point of view, but it also affects performance (e.g., during replication).

In the case of InnoDB tables, you need to pay particular attention to the selection of the primary key. The length is relevant: The primary key is also stored as a reference in all secondary keys (all indexes of a table that are not the primary key). If you have a long primary key, you are thus increasing the size of all secondary keys. At the same time, the primary key often sees very intensive use in the context of joins. A long primary key plus a join leads to a correspondingly large number of CPU cycles.

Another consideration when selecting the primary key is the locality (the physical location) of the data. InnoDB always creates what is known as an index-organized table, which means the data in the blocks are always physically sorted in the way that the primary key is structured. This causes a corresponding sorting pattern of the data, which can be advantageous or disadvantageous depending on the use case; consider time series, for example.

The example in Listing 7 generates two tables: One uses the AUTO_INCREMENT attribute for the primary key, id, which instructs MySQL to assign sequence numbers to rows automatically. The second assigns a random primary key, which is equivalent to randomized rows. The first version is useful for typical access patterns in MySQL, because by sorting with newest first, you can find the most recently generated rows. The second primary key does not work well in this case.

Listing 7

Two Primary Keys

01 mysql> CREATE TABLE data_time (id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, data VARCHAR(64), ts TIMESTAMP);
02
03 mysql> CREATE TABLE data_random (id CHAR(32) NOT NULL PRIMARY KEY, data VARCHAR(64), ts TIMESTAMP);
04
05 mysql> INSERT INTO data_time VALUES ( NULL, 'Row 1', NULL); SELECT SLEEP(1);
06 mysql> INSERT INTO data_time VALUES ( NULL, 'Row 2', NULL); SELECT SLEEP(1);
07 mysql> INSERT INTO data_time VALUES ( NULL, 'Row 3', NULL); SELECT SLEEP(1);
08
09 mysql> INSERT INTO data_random VALUES ( MD5(CURRENT_TIMESTAMP()), 'Row 1', NULL); SELECT SLEEP(1);
10 mysql> INSERT INTO data_random VALUES ( MD5(CURRENT_TIMESTAMP()), 'Row 2', NULL); SELECT SLEEP(1);
11 mysql> INSERT INTO data_random VALUES ( MD5(CURRENT_TIMESTAMP()), 'Row 3', NULL); SELECT SLEEP(1);
12
13 mysql> CREATE TABLE `data_time` (
14   `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
15   `data` VARCHAR(64) DEFAULT NULL,
16   `ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
17   PRIMARY KEY (`id`)
18 ) ENGINE=InnoDB AUTO_INCREMENT=4
19
20 mysql> CREATE TABLE `data_random` (
21   `id` CHAR(32) NOT NULL,
22   `data` VARCHAR(64) DEFAULT NULL,
23   `ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
24   PRIMARY KEY (`id`)
25 ) ENGINE=InnoDB DEFAULT
26
27 mysql> SELECT * FROM data_time;
28 +----+---------+---------------------+
29 | id |  data   | ts                  |
30 +----+---------+---------------------+
31 |  1 |  Row 1  | 2015-09-13 15:24:05 |
32 |  2 |  Row 2  | 2015-09-13 15:24:06 |
33 |  3 |  Row 3  | 2015-09-13 15:24:07 |
34 +----+---------+---------------------+
35
36 mysql> SELECT * FROM data_random;
37 +----------------------------------+---------+---------------------+
38 | id                               |  data   | ts                  |
39 +----------------------------------+---------+---------------------+
40 | 47e0142a3638fdc24fe40d4e4fbce3f1 |  Row 1  | 2015-09-13 15:24:12 |
41 | b833c1e4c5bfc47d0dbe31c2e3f30837 |  Row 3  | 2015-09-13 15:24:14 |
42 | c7d46523a316de4e1496c65c3cbdf358 |  Row 2  | 2015-09-13 15:24:13 |
43 +----------------------------------+---------+---------------------+

Another problem relating to primary keys occurs with time series (Listing  8). The query does not typically reference the time alone, but the time per device. A primary key based only on id would be a poor choice here and is equivalent to random distribution in terms of the query logic. A composite index (device, ts) would re-sort the data (by devices and by time within the devices), substantially accelerating the queries.

Listing 8

Primary Keys and Time Series

01 mysql> CREATE TABLE `data_timeseries` (
02   `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
03   `device` varchar(32) NOT NULL,
04   `data` VARCHAR(64) DEFAULT NULL,
05   `ts` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
06   PRIMARY KEY (`id`)
07 ) ENGINE=InnoDB AUTO_INCREMENT=4
08 ;
09
10 mysql> INSERT INTO data_timeseries VALUES (NULL, 'Refrigerator, '5.8 C', NULL);
11 mysql> INSERT INTO data_timeseries VALUES (NULL, 'Washing machine, '41.2 C', NULL);
12 ...
13
14 mysql> SELECT * FROM data_timeseries WHERE device = 'Refrigerator AND ts
15 BETWEEN '2015-09-13 00:00:00' AND '2015-09-13 23:59:59'
16
17 +----+--------------------------+-----------+---------------------+
18 | id | device                   | data      | ts                  |
19 +----+--------------------------+-----------+---------------------+
20 |  4 | Refrigerator             | 5.8 C     | 2015-09-13 15:34:42 |
21 |  5 | Washing machine          | 41.2 C    | 2015-09-13 15:34:42 |
22 |  6 | Pot plant moisture       | 75% rel.  | 2015-09-13 15:34:42 |
23 |  7 | Refrigerator             | 6.0 C     | 2015-09-13 15:35:42 |
24 |  8 | Washing machine          | 41.1 C    | 2015-09-13 15:35:42 |
25 |  9 | Pot plant moisture       | 74% rel.  | 2015-09-13 15:35:42 |
26 | 10 | Refrigerator             | 6.2 C     | 2015-09-13 15:36:42 |
27 | 11 | Washing machine          | 41.2 C    | 2015-09-13 15:36:42 |
28 | 12 | Pot plant moisture       | 73% rel.  | 2015-09-13 15:36:42 |
29 +----+--------------------------+-----------+---------------------+

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy ADMIN Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

comments powered by Disqus