不用MariaDB/Percona也能查看DDL的进度

导读

如何查看DDL的进度?

使用MariaDB/Percona版本的一个便利之处就是可以及时查看DDL的进度,进而预估DDL耗时。

其实,在官方版本里也是可以查看DDL进度的,认真看手册的同学就能发现手册中有提到过:

You can monitor ALTER TABLE progress for InnoDB tables using Performance Schema.

应该怎么做呢,我们来大概说下。
简言之,需要启用performance_schema,并设置2个地方:

[yejr@imysql]> UPDATE setup_instruments SET ENABLED = 'YES' WHERE NAME LIKE 'stage/innodb/alter%';

[yejr@imysql]> UPDATE setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE '%stages%';

现在,跑个DDL看看:

[yejr@imysql]> SELECT COUNT(*) FROM t1;
+----------+
| count(*) |
+----------+
|   799994 |
+----------+

[yejr@imysql]> ALTER TABLE t1 ADD c4 DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP;


在另一个SESSION中反复执行下面的SQL查看进度:

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_current;
+------------------------------------------------------+----------------+----------------+
| EVENT_NAME                                           | WORK_COMPLETED | WORK_ESTIMATED |
+------------------------------------------------------+----------------+----------------+
| stage/sql/Sending data                               |           NULL |           NULL |
| stage/innodb/alter table (read PK and internal sort) |           3464 |          31227 |
+------------------------------------------------------+----------------+----------------+

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_current;
+------------------------------------------------------+----------------+----------------+
| EVENT_NAME                                           | WORK_COMPLETED | WORK_ESTIMATED |
+------------------------------------------------------+----------------+----------------+
| stage/sql/Sending data                               |           NULL |           NULL |
| stage/innodb/alter table (read PK and internal sort) |          11760 |          31227 |
+------------------------------------------------------+----------------+----------------+

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_current;
+---------------------------------------+----------------+----------------+
| EVENT_NAME                            | WORK_COMPLETED | WORK_ESTIMATED |
+---------------------------------------+----------------+----------------+
| stage/sql/Sending data                |           NULL |           NULL |
| stage/innodb/alter table (merge sort) |          12888 |          31227 |
+---------------------------------------+----------------+----------------+

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_current;
+-----------------------------------+----------------+----------------+
| EVENT_NAME                        | WORK_COMPLETED | WORK_ESTIMATED |
+-----------------------------------+----------------+----------------+
| stage/sql/Sending data            |           NULL |           NULL |
| stage/innodb/alter table (insert) |          22432 |          31227 |
+-----------------------------------+----------------+----------------+

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_current;
+----------------------------------+----------------+----------------+
| EVENT_NAME                       | WORK_COMPLETED | WORK_ESTIMATED |
+----------------------------------+----------------+----------------+
| stage/sql/Sending data           |           NULL |           NULL |
| stage/innodb/alter table (flush) |          34076 |          34980 |
+----------------------------------+----------------+----------------+

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_current;
+------------------------+----------------+----------------+
| EVENT_NAME             | WORK_COMPLETED | WORK_ESTIMATED |
+------------------------+----------------+----------------+
| stage/sql/Sending data |           NULL |           NULL |
+------------------------+----------------+----------------+

最后,也可以查看 events_stages_history 里记录的完整过程:

[yejr@imysql]> SELECT EVENT_NAME, WORK_COMPLETED, WORK_ESTIMATED FROM events_stages_history;
+----------------------------------------------------+----------------+----------------+
| EVENT_NAME                                         | WORK_COMPLETED | WORK_ESTIMATED |
+----------------------------------------------------+----------------+----------------+
| stage/sql/System lock                              |           NULL |           NULL |
| stage/sql/optimizing                               |           NULL |           NULL |
| stage/sql/statistics                               |           NULL |           NULL |
| stage/sql/preparing                                |           NULL |           NULL |
| stage/sql/executing                                |           NULL |           NULL |
| stage/sql/cleaning up                              |           NULL |           NULL |
| stage/sql/starting                                 |           NULL |           NULL |
| stage/sql/checking permissions                     |           NULL |           NULL |
| stage/sql/Opening tables                           |           NULL |           NULL |
| stage/sql/init                                     |           NULL |           NULL |
| stage/sql/cleaning up                              |           NULL |           NULL |
| stage/innodb/alter table (log apply table)         |          35363 |          35363 |
| stage/sql/committing alter table to storage engine |           NULL |           NULL |
| stage/innodb/alter table (end)                     |          35363 |          35363 |
| stage/innodb/alter table (log apply table)         |          35747 |          35747 |
| stage/sql/end                                      |           NULL |           NULL |
| stage/sql/query end                                |           NULL |           NULL |
| stage/sql/closing tables                           |           NULL |           NULL |
| stage/sql/freeing items                            |           NULL |           NULL |
| stage/sql/logging slow query                       |           NULL |           NULL |
+----------------------------------------------------+----------------+----------------+

从上面的结果我们也能看到,一个DDL执行过程包括下面几个主要阶段:

  1. stage/innodb/alter table (read PK and internal sort),读取主键(聚集索引),计算需要处理的data page数;
  2. stage/innodb/alter table (merge sort),处理ALTER TABLE影响的索引,每个索引跑一次(含主键索引);
  3. stage/innodb/alter table (insert),同上;
  4. stage/innodb/alter table (log apply index),将执行DDL期间新增的DML操作应用到index上;
  5. stage/innodb/alter table (flush),flush阶段;
  6. stage/innodb/alter table (log apply table),将执行DDL期间新增的DML操作应用到table上;
  7. stage/innodb/alter table (end),收尾阶段。

再说下利用P_S查看ALTER TABLE 进度的局限性:

  • 只支持MySQL 5.7+的版本;
  • 只支持InnoDB引擎表;
  • 不支持spatial indexes。

最后,我们可以运行下面的SQL,查看当前所有ALTER TABLE进度及其对应的DDL SQL:

[yejr@imysql]> SELECT ec.THREAD_ID, ec.EVENT_NAME, ec.WORK_COMPLETED, ec.WORK_ESTIMATED, pt.STATE, pt.INFO FROM performance_schema.events_stages_current ec left join performance_schema.threads th on ec.thread_id = th.thread_id left join information_schema.PROCESSLIST pt on th.PROCESSLIST_ID = pt.ID where pt.INFO like ‘ALTER%’\G
*************************** 1. row ***************************
     THREAD_ID:  105
    EVENT_NAME:  stage/innodb/alter table (merge sort)
WORK_COMPLETED:  14032
WORK_ESTIMATED:  33999
         STATE:  altering table
          INFO:  ALTER TABLE t1 DROP c4

文中案例的MySQL版本:5.7.16。

参考

[MySQL FAQ]系列 — pt-table-checksum工具使用报错一例

Percona Tollkit
图片来自Percona官网

今天同事在用 percona toolkit 工具中的 pt-table-checksum 对主从数据库进行校验,提交命令后,一直提示下面的信息:

Pausing because Threads_running=0

看字面意思是在提示当前活跃线程数为0,但为什么不继续执行呢。这个提示信息有点含糊其辞,该工具是用Perl写的,因此直接打开看脚本跟踪一下,大概就明白怎么回事了,原来是这个工具有负载保护机制,避免运行时对线上数据库产生影响。

和这个机制相关的参数名是: –max-load,其类型是:Array,用法是一个或多个 variables = value 组成的判断条件,然后根据这个规则判断某些条件是否超标。例如,设定 –max-load=”Threads_running=25″,意思是当前活跃线程数如果超过25,就暂停 checksum 工作,直到活跃线程数低于 25。

因此,在我们这个案例中,想要强制让 table-checksum 继续工作的话,可以设定 –max-load 的值,例如:

pt-table-checksum --max-load="Threads_running=25" ...其他选项...

或者

pt-table-checksum --max-load="Threads_connected=25" ...其他选项...

 

前面的选项意思是判断活跃线程数不要超过25个,后面的选项意思是当前打开的线程数不要超过25个。

下面是 pt-table-checksum 帮助手册里的一段话:

–max-load
type: Array; default: Threads_running=25; group: Throttle

Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than the threshold. The option accepts a comma-sep-
arated list of MySQL status variables to check for a threshold. An optional “=MAX_VALUE” (or “:MAX_VALUE”) can follow each variable. If not
given, the tool determines a threshold by examining the current value and increasing it by 20%.

For example, if you want the tool to pause when Threads_connected gets too high, you can specify “Threads_connected”, and the tool will check
the current value when it starts working and add 20% to that value. If the current value is 100, then the tool will pause when Threads_con-
nected exceeds 120, and resume working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use
either “Threads_connected:110” or “Threads_connected=110”.

The purpose of this option is to prevent the tool from adding too much load to the server. If the checksum queries are intrusive, or if they
cause lock waits, then other queries on the server will tend to block and queue. This will typically cause Threads_running to increase, and the
tool can detect that by running SHOW GLOBAL STATUS immediately after each checksum query finishes. If you specify a threshold for this vari-
able, then you can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however; it will only
give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time.

MySQL出了门,Percona在左,MariaDB在右

MySQL vs Percona vs MariaDB

前言:作为没有自主研发能力的企业,MySQL、Percona、MariaDB到底该选择哪个?

今天看到SOHU-DBA公众号推送的文章:《MySQL分支的选择:Percona还是MariaDB》,原文出处:http://www.biaodianfu.com/mysql-percona-or-mariadb.html,这个文章比较长,有兴趣的可以找到原文看看,我也来简单说下我的看法。

就目前而言,我个人优先推荐使用Percona分支版本,它和官方版本相对是最近的,迁移切换的代价也很小。我从2008年开始就使用Percona分支版本到现在了,当初我们可是国内最大的Percona免费用户。最早只是在官方版本上打了一些补丁,现在已经发展形成了自己的XtraDB引擎,提供PXC高可用解决方案,并且附带了percona-toolkit等DBA管理工具箱,非常方便。

而MariaDB分支则做了较大改动,虽然也集成了XtraDB引擎的大多数优势,但从官方版本迁移过去,不确定因素等风险还是挺大的,比如GTID不兼容、查询优化器个别时候更糟糕等等,不过MariaDB还是挺值得期待的。

综上,建议先继续使用Percona分支,等MariaDB分支相对更成熟了再用不迟,对这两个分支都不感兴趣或不敢用的,再最后选择官方版本吧,把我上面的话当耳边风就行了,但我要告诉大家的是,在官方版本上能遇到奇葩的事情更多,尤其是查询优化器,快吐血了。

Percona Thread Pool性能基准测试

MySQL从5.5.16开始,在MySQL的商业化版本中将Thread Pool作为plugin提供官方功能支持。后来MariaDB也实现了这一功能,Percona也跟进实现了。从这几天对Percona 5.6.16版本做了下thread pool对比测试,试图找到较为合适的配置参数。

下面是几个测试模式对比:

模式 配置参数
Percona 5.6.16-nothp 未开启 thread pool 模式
CASE0-thp(128)-oversub(16)-max(2048) thread_handling = pool-of-threads
thread_pool_size = 128
thread_pool_oversubscribe = 16
thread_pool_max_threads = 2048
CASE1-thp(default) thread_handling = pool-of-threads
其他默认设置
CASE2-thp(default)-oversub(10) thread_handling = pool-of-threads
thread_pool_oversubscribe = 10
其他默认设置
CASE3-thp(default)-oversub(10)-max(10000) thread_handling = pool-of-threads
thread_pool_oversubscribe = 10
thread_pool_max_threads = 100000
其他默认设置
CASE4-thp(default)-oversub(16) thread_handling = pool-of-threads
thread_pool_oversubscribe = 16
其他默认设置
CASE5-thp(128)-oversub(16)-max(100000) thread_handling = pool-of-threads
thread_pool_size = 128
thread_pool_oversubscribe = 16
thread_pool_max_threads = 100000

仍然采用tpcc-mysql这个测试工具,基准值:

测试Warehouse数: 100
warmup time: 60s
run time: 1200s
并发线程数: 64 ~ 1920

测试环境信息:

测试机 DELL PE R710
CPU E5620  @ 2.40GHz(4 core, 8 threads, L3 Cache 12 MB) * 2
内存 32G(4G * 8)
RAID卡 PERC H700 Integrated, 512MB, BBU, 12.10.1-0001
系统 Red Hat Enterprise Linux Server release 6.4 (Santiago)
内核 2.6.32-358.el6.x86_64 #1 SMP
raid级别 raid 0
文件系统 xfs
硬盘 SSD: Intel 520系列SSD, 800G * 1

Percona版本号:5.6.16-64.2-rel64.2-log Percona Server with XtraDB (GPL), Release rel64.2, Revision 569,Percona相关的关键配置有:

innodb_buffer_pool_size = 26G
innodb_flush_log_at_trx_commit = 1

测试脚本可参考:MySQL压力测试经验

测试结果见下:

Percona-Thread-Pool测试-20140701.png

针对这个测试结果,我们可以得到一些结论:

 

1、通常地,只需要开启 pool-of-threads 模式就可以;
2、可以根据实际压力情况,适当调整 thread_pool_oversubscribe 选项以提升 TPS,这个选项值设置范围一般在 3~20;
3、thread-pool-size默认值是逻辑CPU个数,最大值是 128,不建议调整或显式设置,如果显式设定 thread-pool-size 的值,可能会带来反效果;
4、thread_pool_max_threads 默认值是 100000,强烈不建议修改。

综上,对于Thread Pool,我们一般建议设置下面2个选项就足够了:

thread_handling = pool-of-threads
thread_pool_oversubscribe = 10 #这个值建议在3~20间,不清楚的话,无需设置

备注:启用Thread Pool后,想要终止某个查询的话,要这么写KILL QUERY connection_id,而不是写成 KILL connection_id,否则就会导致整个连接被KILL。

如果还有什么问题,欢迎加入我的QQ群(272675472)讨论。

MySQL 5.6.17/Percona5.6.16/MariaDB 10.0.11/OneSQL 5.6.16压测瓶颈分析

之前我进行了MySQL 5.6.17/Percona5.6.16/MariaDB 10.0.11/OneSQL 5.6.16对比基准TPCC压测,从测试结果可以看到在高并发(并发1920线程)模式下,MariaDB的相对优势,也看到了在一般并发场景(并发64线程)模式下,MariaDB拥有绝对优势。

今天我们就来看看这两种模式下,系统负载等性能指标表现,以及各自的瓶颈在哪里,也就能知道为何有这么大差异了。

首先,我们看下并发64线程的对比图表:

MySQL-Percona-MariaDB-perf-data-under-64th

再看下并发1920线程的对比图表:

MySQL-Percona-MariaDB-perf-data-under-1920th

从上面两个图可以看出来几点信息:

结论:
1、并发64线程时,MySQL的瓶颈在 spin_lock,所以 %SYS 跑的很高,TpmC也上不去;
2、并发64线程时,Percona次要瓶颈也是 spin_lock,相比之下 %SYS 也较高,TpmC上不去;
3、并发1920线程时,spin_lock 都是最大的瓶颈,MySQL和Percona的次要瓶颈是lock_rec_has_to_wait_in_queue()函数,因此相对的TpmC也跑不高;