我们有两个表,类似于一个简单的标记-记录结构,如下所示(实际上它要复杂得多,但这就是问题的本质):
tag (A.a) | recordId (A.b)
1 | 1
2 | 1
2 | 2
3 | 2
....和
recordId (B.b) | recordData (B.c)
1 | 123
2 | 666
3 | 1246问题是获取具有特定标记的有序记录。最明显的方法是对(PK)(A.a,A.b),(A.b),(PK)(B.b),(B.b,B.c)进行简单的连接和索引,如下所示:
select A.a, A.b, B.c from A join B on A.b = B.b where a = 44 order by c;但是,这会给文件排序带来令人不快的结果:
+----+-------------+-------+------+---------------+---------+---------+-----------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+---------+---------+-----------+------+----------------------------------------------+
| 1 | SIMPLE | A | ref | PRIMARY,b | PRIMARY | 4 | const | 94 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | B | ref | PRIMARY,b | b | 4 | booli.A.b | 1 | Using index |
+----+-------------+-------+------+---------------+---------+---------+-----------+------+----------------------------------------------+使用巨大且极其冗余的“物化视图”,我们可以获得相当不错的性能,但这是以使业务逻辑复杂化为代价的,这是我们想要避免的,特别是因为A和B表已经是MV:s (并且是其他查询所需要的,实际上是使用UNION的相同查询)。
create temporary table C engine=innodb as (select A.a, A.b, B.c from A join B on A.b = B.b);
explain select a, b, c from C where a = 44 order by c;使情况更加复杂的是,我们在B表上有条件条件,例如范围过滤器。
select A.a, A.b, B.c from A join B on A.b = B.b where a = 44 AND B.c > 678 order by c;但我们有信心,如果文件排序问题消失,我们可以处理这个问题。
有人知道为什么上面代码块3中的简单连接不会使用索引进行排序吗?我们是否可以在不创建新MV的情况下以某种方式绕过这个问题?
下面是我们用于测试的完整SQL清单。
DROP TABLE IF EXISTS A;
DROP TABLE IF EXISTS B;
DROP TABLE IF EXISTS C;
CREATE TEMPORARY TABLE A (a INT NOT NULL, b INT NOT NULL, PRIMARY KEY(a, b), INDEX idx_A_b (b)) ENGINE=INNODB;
CREATE TEMPORARY TABLE B (b INT NOT NULL, c INT NOT NULL, d VARCHAR(5000) NOT NULL DEFAULT '', PRIMARY KEY(b), INDEX idx_B_c (c), INDEX idx_B_b (b, c)) ENGINE=INNODB;
DELIMITER $$
CREATE PROCEDURE prc_filler(cnt INT)
BEGIN
DECLARE _cnt INT;
SET _cnt = 1;
WHILE _cnt <= cnt DO
INSERT IGNORE INTO A SELECT RAND()*100, RAND()*10000;
INSERT IGNORE INTO B SELECT RAND()*10000, RAND()*1000, '';
SET _cnt = _cnt + 1;
END WHILE;
END
$$
DELIMITER ;
START TRANSACTION;
CALL prc_filler(100000);
COMMIT;
DROP PROCEDURE prc_filler;
CREATE TEMPORARY TABLE C ENGINE=INNODB AS (SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b);
ALTER TABLE C ADD (PRIMARY KEY(a, b), INDEX idx_C_a_c (a, c));
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE A.a = 44;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE 1 ORDER BY B.c;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b where A.a = 44 ORDER BY B.c;
EXPLAIN EXTENDED SELECT a, b, c FROM C WHERE a = 44 ORDER BY c;
-- Added after Quassnois comments
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM B FORCE INDEX (idx_B_c) JOIN A ON A.b = B.b WHERE A.a = 44 ORDER BY B.c;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM A JOIN B ON A.b = B.b WHERE A.a = 44 ORDER BY B.c LIMIT 10;
EXPLAIN EXTENDED SELECT A.a, A.b, B.c FROM B FORCE INDEX (idx_B_c) JOIN A ON A.b = B.b WHERE A.a = 44 ORDER BY B.c LIMIT 10;发布于 2009-08-04 13:50:20
当我尝试使用您的脚本重现此查询时:
SELECT A.a, A.b, B.c
FROM A
JOIN B
ON A.b = B.b
WHERE a = 44
ORDER BY
c,它(立即)在0.0043 seconds中完成,返回930行,并生成以下计划:
1, 'SIMPLE', 'A', 'ref', 'PRIMARY', 'PRIMARY', '4', 'const', 1610, 'Using index; Using temporary; Using filesort'
1, 'SIMPLE', 'B', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.A.b', 1, ''对于这样的查询来说,这是非常有效的。
对于这样的查询,不能同时使用单个索引进行筛选和排序。
请参阅我的博客中的这篇文章以获得更详细的解释:
如果您希望查询返回的记录很少,则应该使用A上的索引进行过滤,然后使用文件排序进行排序(就像上面的查询一样)。
如果您希望它返回许多记录(并对它们进行索引),则需要使用LIMIT进行排序,然后进行过滤:
CREATE INDEX ix_a_b ON a (b);
CREATE INDEX ix_b_c ON b (c)
SELECT *
FROM B FORCE INDEX (ix_b_c)
JOIN A
ON A.b = B.b
ORDER BY
b.c
LIMIT 10;
1, 'SIMPLE', 'B', 'index', '', 'ix_b_c', '4', '', 2, 'Using index'
1, 'SIMPLE', 'A', 'ref', 'ix_a_b', 'ix_a_b', '4', 'test.B.b', 4, 'Using index'发布于 2009-08-04 14:15:34
select A.a, A.b, B.c from A join B on A.b = B.b where a = 44 order by c;
如果为列添加别名,会有帮助吗?示例:
SELECT
T1.a AS colA,
T2.b AS colB,
T2.c AS colC
FROM A AS T1
JOIN B AS T2
ON (T1.b = T2.b)
WHERE
T1.a = 44
ORDER BY colC;我所做的唯一更改是:
我将连接条件放在parenthesis
我知道您的实际数据比较复杂,但我假设您提供了一个简单版本的查询,因为问题就在这个简单的级别上。
https://stackoverflow.com/questions/1227574
复制相似问题