我做了一个MySQL性能优化测试,但是测试结果让我感到惊讶。
首先,我为我的测试准备了几个表,它们是“t_worker_attendance_300w(300万数据)、t_worker_attendance_1000w(1000万数据)、t_worker_attendance_1y(1亿数据)、t_worker_attendance_4y(4亿数据)”。
每个表都有相同的字段,相同的索引,它们被复制,包括4亿的数据量也从300万个数据增加。
据我理解,MySQL的性能肯定会受到数据量大小的严重影响,但结果已经困扰了我整整一周。我几乎已经测试了我能想到的场景,但是它们的执行时间是一样的!
这是一个新的MySQL 5.6.16服务器,我测试了我能想到的任何场景,包括内部连接.
A) SHOW CREATE TABLE t_worker_attendance_4y
CREATE TABLE `t_worker_attendance_4y` (
`id` bigint(20) NOT NULL ,
`attendance_id` char(32) NOT NULL,
`worker_id` char(32) NOT NULL,
`subcontractor_id` char(32) NOT NULL ,
`project_id` char(32) NOT NULL ,
`sign_date` date NOT NULL ,
`sign_type` char(2) NOT NULL ,
`latitude` double DEFAULT NULL,
`longitude` double DEFAULT NULL ,
`sign_wages` decimal(16,2) DEFAULT NULL ,
`confirm_wages` decimal(16,2) DEFAULT NULL ,
`work_content` varchar(60) DEFAULT NULL ,
`team_leader_id` char(32) DEFAULT NULL,
`sign_state` char(2) NOT NULL ,
`confirm_date` date DEFAULT NULL ,
`sign_mode` char(2) DEFAULT NULL ,
`checkin_time` datetime DEFAULT NULL ,
`checkout_time` datetime DEFAULT NULL ,
`sign_hours` decimal(6,1) DEFAULT NULL ,
`overtime` decimal(6,1) DEFAULT NULL ,
`confirm_hours` decimal(6,1) DEFAULT NULL ,
`signimg` varchar(200) DEFAULT NULL ,
`signoutimg` varchar(200) DEFAULT NULL ,
`photocheck` char(2) DEFAULT NULL ,
`machine_type` varchar(2) DEFAULT '1' ,
`project_coordinate` text ,
`floor_num` varchar(200) DEFAULT NULL ,
`device_serial_no` varchar(32) DEFAULT NULL ,
KEY `checkin_time` (`checkin_time`),
KEY `worker_id` (`worker_id`),
KEY `project_id` (`project_id`),
KEY `subcontractor_id` (`subcontractor_id`),
KEY `sign_date` (`sign_date`),
KEY `project_id_2` (`project_id`,`sign_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8B) SHOW INDEX FROM t_worker_attendance_4y
+------------------------+------------+------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------------+------------+------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_worker_attendance_4y | 1 | checkin_time | 1 | checkin_time | A | 5017494 | NULL | NULL | YES | BTREE | | |
| t_worker_attendance_4y | 1 | worker_id | 1 | worker_id | A | 1686552 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | project_id | 1 | project_id | A | 102450 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | subcontractor_id | 1 | subcontractor_id | A | 380473 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | sign_date | 1 | sign_date | A | 512643 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | project_id_2 | 1 | project_id | A | 102059 | NULL | NULL | | BTREE | | |
| t_worker_attendance_4y | 1 | project_id_2 | 2 | sign_date | A | 1776104 | NULL | NULL | | BTREE | | |
+------------------------+------------+------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+C) EXPLAIN SELECT SQL_NO_CACHE tw.project_id, tw.sign_date FROM t_worker_attendance_4y tw WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb' AND sign_date >= '07/01/2018' AND sign_date < '08/01/2018' ;
+----+-------------+-------+------+-----------------------------------+--------------+---------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-----------------------------------+--------------+---------+-------+----------+--------------------------+
| 1 | SIMPLE | tw | ref | project_id,sign_date,project_id_2 | project_id_2 | 96 | const | 54134596 | Using where; Using index |
+----+-------------+-------+------+-----------------------------------+--------------+---------+-------+----------+--------------------------+他们都通过了相同的联合指数。
SELECT tw.project_id, tw.sign_date FROM t_worker_attendance_300w tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sgin_date >= '07/01/2018'
AND sgin_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.02 secSELECT tw.project_id, tw.sign_date FROM t_worker_attendance_1000w tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sgin_date >= '07/01/2018'
AND sgin_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.01 secSELECT tw.project_id, tw.sign_date FROM t_worker_attendance_1y tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sgin_date >= '07/01/2018'
AND sgin_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.02 secSELECT tw.project_id, tw.sign_date FROM t_worker_attendance_4y tw
WHERE tw.project_id = '39235664ba734887b298ee568fbb66fb'
AND sgin_date >= '07/01/2018'
AND sgin_date < '08/01/2018' LIMIT 0,10000;
Execution time: 0.02 sec.
我的猜测是,随着数据量的增加,MySQL的查询性能将急剧下降,但两者并没有太大的不同。所以我无法优化我的查询。我不知道何时实现表分区计划或子数据库子表计划。
我想知道的是,为什么小数据量索引的执行速度与大数据量索引的执行速度相同。如果你能帮我,我非常感谢你。
发布于 2019-06-25 04:13:10
发布于 2019-06-25 07:14:13
我有了一个新的答案,有人告诉我“因为你的查询被索引覆盖了,索引实际上是查询索引的时间。Mysql索引使用B+树结构,在同一树高下的查询时间基本相同,可以计算这些表索引的树的高度是否相同。”
所以我按要求进行了调查。
mysql> SELECT b.name, a.name, index_id, type, a.space, a.PAGE_NO
-> FROM information_schema.INNODB_SYS_INDEXES a,
-> information_schema.INNODB_SYS_TABLES b
-> WHERE a.table_id = b.table_id AND a.space <> 0;
+-------------------------------------------------+---------------------+----------+------+-------+---------+
| name | name | index_id | type | space | PAGE_NO |
+-------------------------------------------------+---------------------+----------+------+-------+---------+
| mysql/innodb_index_stats | PRIMARY | 18 | 3 | 2 | 3 |
| mysql/innodb_table_stats | PRIMARY | 17 | 3 | 1 | 3 |
| mysql/slave_master_info | PRIMARY | 20 | 3 | 4 | 3 |
| mysql/slave_relay_log_info | PRIMARY | 19 | 3 | 3 | 3 |
| mysql/slave_worker_info | PRIMARY | 21 | 3 | 5 | 3 |
| test_gomeet/t_worker_attendance_1y | GEN_CLUST_INDEX | 45 | 1 | 12 | 3 |
| test_gomeet/t_worker_attendance_1y | checkin_time | 46 | 0 | 12 | 16389 |
| test_gomeet/t_worker_attendance_1y | project_id | 50 | 0 | 12 | 32775 |
| test_gomeet/t_worker_attendance_1y | worker_id | 53 | 0 | 12 | 49161 |
| test_gomeet/t_worker_attendance_1y | subcontractor_id | 54 | 0 | 12 | 65547 |
| test_gomeet/t_worker_attendance_1y | sign_date | 66 | 0 | 12 | 81933 |
| test_gomeet/t_worker_attendance_1y | project_id_2 | 408 | 0 | 12 | 98319 |
| test_gomeet/t_worker_attendance_300w | GEN_CLUST_INDEX | 56 | 1 | 13 | 3 |
| test_gomeet/t_worker_attendance_300w | checkin_time | 58 | 0 | 13 | 16389 |
| test_gomeet/t_worker_attendance_300w | project_id | 59 | 0 | 13 | 16427 |
| test_gomeet/t_worker_attendance_300w | worker_id | 60 | 0 | 13 | 16428 |
| test_gomeet/t_worker_attendance_300w | subcontractor_id | 61 | 0 | 13 | 16429 |
| test_gomeet/t_worker_attendance_300w | sign_date | 67 | 0 | 13 | 65570 |
| test_gomeet/t_worker_attendance_300w | project_id_2 | 397 | 0 | 13 | 81929 |
| test_gomeet/t_worker_attendance_4y | GEN_CLUST_INDEX | 42 | 1 | 9 | 3 |
| test_gomeet/t_worker_attendance_4y | checkin_time | 47 | 0 | 9 | 16389 |
| test_gomeet/t_worker_attendance_4y | worker_id | 49 | 0 | 9 | 32775 |
| test_gomeet/t_worker_attendance_4y | project_id | 52 | 0 | 9 | 49161 |
| test_gomeet/t_worker_attendance_4y | subcontractor_id | 55 | 0 | 9 | 65547 |
| test_gomeet/t_worker_attendance_4y | sign_date | 69 | 0 | 9 | 81933 |
| test_gomeet/t_worker_attendance_4y | project_id_2 | 412 | 0 | 9 | 98319 |
+-------------------------------------------------+---------------------+----------+------+-------+---------+mysql> SHOW GLOBAL STATUS LIKE 'Innodb_page_size';
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_page_size | 16384 |
+------------------+-------+root@localhost:/usr/local/mysql/data/test_gomeet# hexdump -s 49216 -n 02 t_worker_attendance_300w.ibd
000c040 0200
000c042
root@localhost:/usr/local/mysql/data/test_gomeet# hexdump -s 49216 -n 02 t_worker_attendance_1y.ibd
000c040 0300
000c042
root@localhost:/usr/local/mysql/data/test_gomeet# hexdump -s 49216 -n 02 t_worker_attendance_4y.ibd
000c040 0300
000c042计算结果表明:3.34亿,3.589为4亿。几乎一样。是因为这个吗?
https://stackoverflow.com/questions/56735178
复制相似问题