We are always looking for interesting articles about MySQL! Have you written something and would like to it published here? Please contact us via feedback form »

What's New in MySQL 5.6

With MySQL 5.5 generally available and being deployed all over the planet, the architects and strategists can start looking ahead to the exciting new thing with SQL in its name: MySQL 5.6.

MySQL 5.6 builds on the momentum of 5.5, and Oracle's investment and commitment to MySQL, by delivering better performance and scalability.

At this year's MySQL Conference & Expo, you'll hear about:

Optimizer improvements for all-around query performance.
InnoDB improvements for higher transactional throughput.
New NoSQL-style memcached APIs.
Partitioning improvements for querying and managing huge tables.
Replication improvements covering many aspects.
Better performance monitoring by expanding the data available through the PERFORMANCE_SCHEMA.

Here are the features that you will get as a graduated series of MySQL 5.6 development milestone releases. The first such milestone release, MySQL 5.6.2, is available for download right now at http://dev.mysql.com/downloads/. Start beta testing the 5.6 release now and give feedback to the MySQL engineering team.

Performance Improvements

Optimizer Improvements

Index Condition Pushdown

Moves more of the processing for WHERE clauses to the storage engine. Instead of fetching entire rows to evaluate against a set of WHERE clauses, ICP sends those clauses to the storage engine, which can prune the result set by examining index tuples. The result is less I/O overhead for the base table, and less internal communication overhead for the server and the storage engine. This feature works with InnoDB, MyISAM, and NDBCLUSTER tables. Read more about index condition pushdown.

Multi-Range Read

Until the day when you have all the SSDs you want, it's faster to read data sequentially from disk than to do random accesses. For secondary indexes, the order for the index entries on disk is different than the order of disk blocks for the full rows. Instead of retrieving the full rows using a sequence of small out-of-order reads, MRR scans one or more index ranges used in a query, sorts the associated disk blocks for the row data, then reads those disk blocks using larger sequential I/O requests. The speedup benefits operations such as range index scans and equi-joins on indexed columns. (Think InnoDB foreign keys.) Works all storage engines. Read more about multi-range read.

File Sort Optimization

For queries that combine ORDER BY non_indexed_column and a LIMIT x clause, this feature speeds up the sort when the contents of X rows can fit into the sort buffer. Works with all storage engines.

InnoDB Improvements

MySQL 5.6 builds on the platform of InnoDB fully integrated as the default storage engine, which happened way back in MySQL 5.5.

Persistent Optimizer Stats

Provides improved accuracy of InnoDB index statistics, and consistency across MySQL restarts. InnoDB precomputes statistics that help the optimizer decide which indexes to use in a query, by sampling a portion of the index. You can adjust the amount of sampling that InnoDB does for each index. The resulting statistics can now persist across server restarts, rather than being recomputed (and possibly changing) due to restarts and some runtime events. The more accurate statistics can improve query performance, and the persistence aspect can keep query performance stable. This feature is controlled by the configuration options innodb_analyze_is_persistent, innodb_stats_persistent_sample_pages, and innodb_stats_transient_sample_pages. When the persistent stats feature is enabled, the statistics are only recomputed when you explicitly run ANALYZE TABLE for the table. Read more about Persistent Optimizer Stats.

New INFORMATION_SCHEMA Tables

Once InnoDB information was made available for queries through the INFORMATION_SCHEMA in MySQL 5.5, people clamored for more kinds of status and monitoring information. The SQL interface is more standardized and predictable than parsing the text output from SHOW STATUS commands.

Metrics table: Provides a generic and comprehensive resource and performance monitoring framework for InnoDB. The new I_S table is INNODB_METRICS.
System Tables: Makes the InnoDB internal data dictionary available for SQL queries, for convenience of monitoring. The new I_S tables are INNODB_SYS_TABLES, INNODB_SYS_TABLESTATS, INNODB_SYS_INDEXES, INNODB_SYS_COLUMNS, INNODB_SYS_FIELDS, INNODB_SYS_FOREIGN, and INNODB_SYS_FOREIGN_COLS.
Buffer Pool Information table: Displays buffer pool page information for tuning on large-memory or highly loaded systems. (Highly requested by customers and community users.) The new I_S tables are INNODB_BUFFER_PAGE, INNODB_BUFFER_PAGE_LRU, and INNODB_BUFFER_POOL_STATS.

Additional Optimizer Support

InnoDB supports the ICP, MRR, and file sort optimizer features mentioned earlier.

Split Kernel Mutex

The InnoDB team continues to remove bottlenecks for busy systems. Now instead of a single mutex controlling concurrent access to core InnoDB operations, several more fine-grained mutexes and rw-locks reduce contention on a busy system. In particular, the subsystems for transaction control, MVCC views, and locking now all have separate mutexes or rw-locks. Read more about the kernel mutex split.

Multi-Threaded Purge

InnoDB now has multiple purge threads, making purge operations across multiple tables more efficient. Originally, purge operations were handled by the InnoDB master thread, leading to contention with other InnoDB operations; in MySQL 5.5, purge was moved into a single separate thread, and in MySQL 5.6 you can set innodb_purge_threads to a value greater than 1. Read more about multiple purge threads.

Separate Flush Thread

A separate http://dev.mysql.com/doc/refman/5.6/en/glossary.html#glos_flush">flush thread (page_cleaner) makes flushing operations more efficient. This operation was formerly controlled by the InnoDB master thread, leading to contention with other InnoDB operations. Read more about the separate flush thread.

Pruning the InnoDB Table Cache

To ease the memory load on systems with huge numbers of tables, InnoDB now frees up the memory associated with an opened table. An LRU algorithm selects tables that have gone the longest without being accessed. To reserve more memory for open tables, increase the value of the --table_definition_cache=# configuration option. Read more about the InnoDB table cache.

NoSQL Interface via `memcached`

The ever-increasing performance demands of web-based services has generated significant interest in providing NoSQL access methods to MySQL – maintaining all of the advantages of your existing relational database infrastructure, while providing blazing fast performance for simple queries, using an API to complement regular SQL access to your data.

Using the memcached API, web services can now directly access the InnoDB storage engine without transformations to SQL, ensuring low latency and high throughput for read/write queries. Operations such as SQL parsing are eliminated and more of the server's hardware resources (CPU, memory and I/O) are dedicated to servicing the query within the storage engine itself.

By using memcached, developers and DBAs are able to:

Preserve investments in memcached infrastructure by reusing existing memcached clients and eliminating the need for application changes.
Access the full range of memcached client libraries and platforms, providing maximum deployment flexibility and consistently high performance across all supported environments.
Extend memcached functionality by integrating a persistent, crash-safe, transactional database back-end offering ACID compliance.

The InnoDB memcached API is available in Labs.mysql.com. Read more about memcached API for InnoDB.

The MySQL Cluster 7.2 Development Milestone Release also previews a memcached interface for the NDBCLUSTER storage engine. Read more about the NoSQL interface for NDBCLUSTER.

Scalability Improvements

Data gets bigger. That's a fact of life. Partitioned tables help to segment that data into manageable portions, while keeping the convenience of a SQL interface to the full table. These new features make certain operations with partitions faster and more convenient.

Partitioning Improvements

Explicit Partition Selection

With partitioned tables, MySQL can restrict processing to only the relevant portions of a big data set. Now you can directly define which partitions are used in a query, DML, or data load operation, rather than repeating all the partitioning criteria in each statement.

SELECT * FROM employees PARTITION (p0, p2);
DELETE FROM employees PARTITION (p0, p1);
UPDATE employees PARTITION (p0) SET store_id = 2 WHERE fname = 'Jill';
SELECT e.id, s.city FROM employees AS e JOIN stores PARTITION (p1) AS s ...;

Replication Improvements

Replication is a fruitful area for enhancements to data integrity, availability, and let's not forget performance.

Replication Improvements to Performance

Optimized Row-Based Replication

By only replicating partial before and after images for INSERT, UPDATE and DELETE events where primary keys or explicit columns were set in the SQL statement, performance can be increased while binary log disk space, network resources and server memory footprint are reduced.

Multi-Threaded Slaves

Replication performance is improved by using multiple execution threads to apply replication events to slave servers. The multi-threaded slave splits work between worker threads based on the database name, allowing updates to be applied in parallel rather than sequentially.

As a result, replication throughput is increased and latency is reduced which minimizes the risk of replication lag, enabling slaves to serve the freshest updates to the application.

The multi-threaded slave functionality is available now as part of the snapshot-next-mr-wl5563-labs build which can be downloaded from Labs.mysql.com.

Replication Improvements to Data Integrity

Crash-Safe Slaves

Extends the robustness and ease-of-use of MySQL replication by making the slaves crash-safe when using transactional storage engines such as InnoDB.

The slave can automatically recover from a failure and resume replicating DML updates, without the DBA having to access the master.info and relaylog.info files to manually roll back replication to the last successfully committed transaction, or to skip transactions.

As a result, data integrity is enhanced and DBAs can be free to concentrate on more strategic data management activities.

Replication Checksums

Ensures the integrity of data being replicated to a slave by detecting data corruption and returning an error, preventing the slave itself from becoming corrupt.

Checksums are implemented in the binary and relay logs as well as to individual replication events, allowing errors to be detected whether they are caused by memory, disk or network failures, or by the database itself. Checksum checking can be implemented on a per-slave basis, giving maximum flexibility in how and where it is deployed.

Time-Delayed Replication

You can define a time delay for events to be replicated from a master to each slave, defined in millisecond increments up to a maximum of 68 years!

Time-Delayed Replication affords protection against operational errors made on the master, for example accidently dropping tables, in which event the slave can be promoted to the new master in order to restore the database to its previous state. Time-Delayed Replication can also be useful in testing application behavior by emulating any instances of replication lag.

Time-Delayed Replication is implemented at the per-slave level (via holding execution of the SQL_THREAD), so you could configure multiple slaves to apply replication events immediately, and another slave to apply only after a delay of 5 minutes, therefore providing deployment flexibility.

Replication Improvements to Usability

Informational Log Events

Enhances auditing and debugging when using Row-Based Replication by writing the original query to the binary log, which is then replicated with its associated row-based event to the slave.

Remote Binlog Back-up

Enhances operational efficiency by using the replication channel to create real-time back-ups from the binary log.

By adding a raw flag, the binlog is written out to remote back-up servers, without having a MySQL database instance translating it into SQL statements, and without the DBA needing SSH access to each master server.

Server UUIDs

Automatically generates a Universally Unique Identifier (UUID) for each server, allowing MySQL Enterprise Monitor or any other monitoring tool to retrieve information about master and slave servers in a replication configuration. The UUID is available through a SQL query and in the output of the SHOW SLAVE STATUS command. This technique requires fewer database connections and works better with servers that are monitored remotely or that use virtual IP addresses. This feature is especially useful in large and highly dynamic replication environments, making auto-discovery more reliable and simplifying systems management.

Instrumentation and Diagnostic Improvements via `PERFORMANCE_SCHEMA`

MySQL 5.6 greatly enhances the PERFORMANCE_SCHEMA features for performance monitoring and tuning. The information in the performance_schema tables lets you see how various low-level items factor into overall database performance, which ones are the hottest under various workloads and system configurations, and trace issues back to the relevant file and line in the source code so you can really see what's happening behind the scenes. Read more about Performance Schema.

Reduced Overhead

The performance_schema code is further tuned in 5.6 to reduce the performance overhead of running with this feature enabled.

Table I/O and Instrumentation

The setup_object table enables a DBA to understand, analyze, and tune how an application generates I/O load related to table access on the MySQL database server.

Table Locks Instrumentation

The setup_object table enables a DBA to identify hot tables and other object-related bottlenecks caused by the data access patterns of an application.

Session/User Level Instrumentation

The setup_actors table enables a DBA to monitor the load generated from a specific user or application by selectively instrumenting specific end user/application connections. This is valuable to narrow down the monitoring data collected: a) by thread or session, b) by connections originating from a given user account, c) by connections originating from a given machine, d) by connections originating from a given user account on a given machine. Reducing the amount of instrumented connections also reduces the runtime overhead of the instrumentation in production.

Global Performance Metric Summaries

Globally aggregates the data collected by the PERFORMANCE_SCHEMA by thread / by object / by instrumentation point. Enables DBA to write monitoring scripts and applications.

Table/Index I/O Summary

The setup_object table aggregates table I/O data collected by the PERFORMANCE_SCHEMA, by index, by table. Aggregation shows which tables or indexes are hot and most used by an application, and helps to identify the application data access patterns.

Table Lock Wait Summary

Aggregates Table lock data collected by the PERFORMANCE_SCHEMA by table. This aggregation shows which hot tables are often locked by an application, and helps to identify the application bottlenecks caused by table locking.

Statement-Level Instrumentation

Enables a DBA to monitor statement execution, collect per-statement metrics, and analyze statements/executions by end user session or global aggregates. Quantify which statements are generated by an application, execution times/counts and access paths. Metrics reported give insight on the data volumes, the selectivity of WHERE clauses, and index usage patterns. Available in Labs.mysql.com.

Show Contents of Host Cache - centralized logging of connection errors

(Highly requested by customers and community users. Delivered via community code contribution.) Provides centralized logging of connection-related errors. MySQL internal host cache has been instrumented and exposed in a relational table for SQL access, providing a centralized logging of connection errors, with details about the exact root cause (SSL, DNS, Authentication plugins, ...) This enables a DBA to easily troubleshoot large deployments involving numerous users and heterogeneous applications. Available from Labs.mysql.com.

Next Steps:

Now that you have read about all the exciting new performance and scalability improvements, it's your turn to take MySQL 5.6 for a spin:

Download MySQL 5.6: http://dev.mysql.com/downloads/. For the more experimental features listed here, get them at http://labs.mysql.com.
Read the MySQL 5.6 docs: http://dev.mysql.com/doc/
Blog about your experience: http://planet.mysql.com/
Join the discussion: http://forums.mysql.com/