工作中遇到的99%SQL优化，这里都能给你解决方案(二)

程序员小强

预计阅读时间：9分钟

小强介绍了sql的优化第一篇，并把第一篇文章链接贴到下方。今天带来order by和group by的优化，同时会介绍Mysql支持两种方式的排序filesort和index。工作中遇到的99%SQL优化，这里都能给你解决方案

-- 示例表
CREATE TABLE `employees` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(24) NOT NULL DEFAULT '' COMMENT '姓名',
`age` int(20) NOT NULL DEFAULT '0' COMMENT '年龄',
`position` varchar(20) NOT NULL DEFAULT '' COMMENT '职位',
`hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT '入职时间',
PRIMARY KEY (`id`),
KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE,
KEY `idx_age` (`age`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=136326 DEFAULT CHARSET=utf8 COMMENT='员工表'

Order by与Group by优化

EXPLAIN select * from employees WHERE name='LiLei' and position='dev' order by age;

利用最左前缀法则：中间字段不能断，因此查询用到了name索引，从key_len=74也能看出，age索引列用在排序的过程中，因为Extra字段里没有using filesort。

EXPLAIN select * from employees WHERE name='LiLei' order by position;

从explain的执行结果来看：key_len=74, 查询使用name索引，由于用了position进行排序，跳过了age，出现了Using filesort。

EXPLAIN select * from employees WHERE name='LiLei' order by age,position;

查找只用到了name索引，age和position用于排序，无Using filesort。

EXPLAIN select * from employees WHERE name='LiLei' order by position,age;

和上一个case不同的是，Extra中出现了Using filesort，因为索引的创建顺序为name，age，position，但是排序的时候age和position颠倒了位置。

EXPLAIN select * from employees WHERE name='LiLei' order by age asc, position desc;

虽然排序的字段和联合索引顺序是一样的，且order by是默认升序，这里position desc是降序，导致与索引的排序方式不同，从而产生Using filesort。Mysql8以上版本有降序索引可以支持该种查询方式。

EXPLAIN select * from employees WHERE name in('LiLei', 'zhuge') order by age, position ;

对于排序来说，多个相等条件也是范围查询。

EXPLAIN select * from employees WHERE name > 'a' order by name;

可以用覆盖索引优化

EXPLAIN select name,age,position from employees WHERE name > 'a' order by name;

filesort排序

EXPLAIN select * from employees where name='LiLei' order by position;

查看这条sql对应trace结果（只展示排序部分）：

set session optimizer_trace="enabled=on",end_markers_in_json=on; ‐‐开启trace
select * from employees where name = 'LiLei' order by position;
select * from information_schema.OPTIMIZER_TRACE;
{
"join_execution": { --sql执行阶段
"select#": 1,
"steps": [
{
"filesort_information": [
{
"direction": "asc",
"table": "`employees`",
"field": "position"
}
] /* filesort_information */,
"filesort_priority_queue_optimization": {
"usable": false,
"cause": "not applicable (no LIMIT)"
} /* filesort_priority_queue_optimization */,
"filesort_execution": [
] /* filesort_execution */,
"filesort_summary": { --文件排序信息
"rows": 1, --预计扫描行数
"examined_rows": 1, --参与排序的行
"number_of_tmp_files": 0, --使用临时文件的个数，这个值为0代表全部使用sort_buffer内存排序，否则使用磁盘文件排序
"sort_buffer_size": 200704, --排序缓存的大小
"sort_mode": "" --排序方式，这里用的单路排序
} /* filesort_summary */
}
] /* steps */
} /* join_execution */
}

修改maxlengthforsortdata=10

set max_length_for_sort_data = 10; --employees表所有字段长度总和肯定大于10字节
select * from employees where name = 'LiLei' order by position;
select * from information_schema.OPTIMIZER_TRACE;
{
"join_execution": {
"select#": 1,
"steps": [
{
"filesort_information": [
{
"direction": "asc",
"table": "`employees`",
"field": "position"
}
] /* filesort_information */,
"filesort_priority_queue_optimization": {
"usable": false,
"cause": "not applicable (no LIMIT)"
} /* filesort_priority_queue_optimization */,
"filesort_execution": [
] /* filesort_execution */,
"filesort_summary": {
"rows": 1,
"examined_rows": 1,
"number_of_tmp_files": 0,
"sort_buffer_size": 53248,
"sort_mode": "" --排序方式为双路排序
} /* filesort_summary */
}
] /* steps */
} /* join_execution */
}

对比这两个排序模式，单路排序会把所有的需要查询的字段数据都放到sortbuffer中，而双路排序只会把主键id和需要排序的字段放到sortbuffer中进行排序，然后再通过主键id 回到原表查询需要的字段数据。MySQL通过maxlengthforsortdata这个参数来控制排序，在不同场景下使用不同的排序模式，从而提升排序效率。

优化总结

Mysql支持两种方式的排序filesort和index，using index是指Mysql扫描索引本身完成排序。index效率高，filesort效率低。
order by满足两种情况会使用using index。order by语句使用索引最左前列。使用where子句和order by子句条件列组合满足索引最左前列。
尽量在索引列上完成排序，遵循索引建立（索引创建的顺序）时候的最左前缀法则。
如果order by 的条件不在索引列上，就会产生using filesort。