
Priority/cascading datetime without a full table scan

Stack Overflow user
Asked 2017-12-08 05:40:28
Answers: 3, Views: 54, Followers: 0, Votes: 3
SELECT IF(priority_date, priority_date, created_at) as created_at
FROM table
WHERE IF(priority_date , priority_date , created_at) 
    BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59';

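What is the best way to run this query, performance-wise?

I have a fairly large table with two date columns: created_at and priority_date.

priority_date is not always present, but when it is, it is the value that should be queried; otherwise the query falls back to created_at. created_at is always generated when the row is created. The query above results in an (almost) full table scan.

EXPLAIN plan for the initial query: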

+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
| id   | select_type | table           | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+
|    1 | SIMPLE      | table           | ALL  | NULL          | NULL | NULL    | NULL | 444877 | Using where |
+------+-------------+-----------------+------+---------------+------+---------+------+--------+-------------+

I should also point out that priority_date and created_at do not necessarily both fall within the time range for a given row. So doing something like this:

WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'

could produce bad results if priority_date is 2017-10-04 23:10:43 and created_at is 2017-10-10 01:23:45.
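The failure case can be reproduced on a toy table. This is a minimal sketch, not the asker's schema: it uses Python's sqlite3 module as a stand-in for MySQL, a hypothetical table named t, and COALESCE in place of MySQL's IF(priority_date, ...), assuming priority_date is either NULL or a real timestamp.

```python
import sqlite3

# Hypothetical two-row table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, priority_date TEXT, created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO t (id, priority_date, created_at) VALUES (?, ?, ?)",
    [
        (1, "2017-10-04 23:10:43", "2017-10-10 01:23:45"),  # priority_date outside the day
        (2, None, "2017-10-10 09:00:00"),                   # no priority_date, falls back
    ],
)
LO, HI = "2017-10-10 00:00:00", "2017-10-10 23:59:59"

# Intended semantics: test priority_date when present, else created_at.
fallback_ids = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE COALESCE(priority_date, created_at) BETWEEN ? AND ? ORDER BY id",
    (LO, HI),
)]

# Plain OR over both columns: also admits row 1 through its created_at.
naive_or_ids = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE priority_date BETWEEN ? AND ? "
    "OR created_at BETWEEN ? AND ? ORDER BY id",
    (LO, HI, LO, HI),
)]

print(fallback_ids, naive_or_ids)
```

Row 1 is the unwanted match: its priority_date is six days earlier, yet the plain OR returns it because created_at falls on the requested day.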

Current row count of the table above: 582739

WHERE priority_date BETWEEN... count: 3908

WHERE created_at BETWEEN... count: 3437

Example EXPLAIN for a query on just one of the columns:

+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
| id   | select_type | table           | type  | possible_keys                    | key                              | key_len | ref  | rows | Extra                 |
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+
|    1 | SIMPLE      | table           | range | table_created_at_index           | table_created_at_index           | 5       | NULL | 3436 | Using index condition |
+------+-------------+-----------------+-------+----------------------------------+----------------------------------+---------+------+------+-----------------------+

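Clearly, the IF is not the most efficient approach. The columns are indexed, and the per-column counts line up with the row estimates in the EXPLAIN plans. How can I do a priority/fallback query without a severe performance penalty?

Edit

The best I have been able to come up with (but wow, it feels verbose and copy/paste-y):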

SELECT IF(priority_date, priority_date, created_at) as created_at, priority_date
FROM table 
WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
    OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
HAVING ((priority_date AND priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59')
    OR created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59');

And its EXPLAIN plan:

+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
| id   | select_type | table           | type        | possible_keys                                                         | key                                                                   | key_len | ref  | rows | Extra                                                                                                |
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+
|    1 | SIMPLE      | table           | index_merge | table_priority_date_index,table_created_at_index                      | table_priority_date_index,table_created_at_index                      | 6,5     | NULL | 7343 | Using sort_union(table_priority_date_index,table_created_at_index); Using where                      |
+------+-------------+-----------------+-------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+---------+------+------+------------------------------------------------------------------------------------------------------+

3 Answers

Stack Overflow user

Accepted answer

Posted 2017-12-08 16:13:37

First you need a composite index on (priority_date, created_at); then you can use a query like this:

SELECT IF(priority_date, priority_date, created_at) as created_at, priority_date
FROM table 
WHERE priority_date BETWEEN '2017-10-10' AND '2017-10-10 23:59:59'
  OR (priority_date IS NULL AND created_at BETWEEN '2017-10-10' AND '2017-10-10 23:59:59');

Putting priority_date first in the composite index makes a big difference. No UNION is needed.

EXPLAIN on 400K rows with a 2000-row result:

Extra: Using where; Using index
key: priority_created_compound
rows: 2000
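
One way to gain confidence in this rewrite is to compare it with the original fallback expression on a few hand-picked rows. The sketch below assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, a hypothetical table t, and COALESCE in place of IF. It says nothing about MySQL's query plan; it only checks that the two predicates select the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, priority_date TEXT, created_at TEXT NOT NULL)"
)
# Composite index with priority_date first, named as in the EXPLAIN output above.
conn.execute("CREATE INDEX priority_created_compound ON t (priority_date, created_at)")
conn.executemany(
    "INSERT INTO t (id, priority_date, created_at) VALUES (?, ?, ?)",
    [
        (1, "2017-10-10 08:00:00", "2017-10-01 12:00:00"),  # priority_date in range
        (2, "2017-10-04 23:10:43", "2017-10-10 01:23:45"),  # priority out, created in
        (3, None, "2017-10-10 09:00:00"),                   # NULL priority, created in
        (4, None, "2017-10-11 00:00:00"),                   # NULL priority, created out
    ],
)
params = {"lo": "2017-10-10 00:00:00", "hi": "2017-10-10 23:59:59"}


def ids(where):
    """Return the ids of rows matching the given WHERE clause, in id order."""
    sql = f"SELECT id FROM t WHERE {where} ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


fallback = ids("COALESCE(priority_date, created_at) BETWEEN :lo AND :hi")
rewritten = ids(
    "priority_date BETWEEN :lo AND :hi "
    "OR (priority_date IS NULL AND created_at BETWEEN :lo AND :hi)"
)
print(fallback, rewritten)
```

The priority_date IS NULL guard is what keeps row 2 out: without it, the second branch would match any row whose created_at falls on the day.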
Votes: 2

Stack Overflow user

Posted 2017-12-08 06:14:25

SELECT priority_date as created_at
FROM table
WHERE priority_date BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'

UNION ALL

SELECT created_at
FROM table
WHERE created_at BETWEEN '2017-10-10 00:00:00' AND '2017-10-10 23:59:59'
 AND priority_date IS NULL;

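For the first half of this query you need an index that starts with priority_date; for the second half you need an index on (created_at, priority_date).

The first half naturally does not match any row where priority_date is NULL.

The second half applies the range condition on created_at and then, within the subset of matching rows, tests whether priority_date is NULL. This can be done with index condition pushdown.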
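
Because the second half requires priority_date IS NULL, the two halves can never return the same table row, which is why UNION ALL is safe here. The sketch below checks that on a small sample. It assumes SQLite (Python's sqlite3) in place of MySQL, a hypothetical table t, hypothetical index names, and COALESCE standing in for IF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, priority_date TEXT, created_at TEXT NOT NULL)"
)
# One index per half of the UNION ALL (index names are illustrative).
conn.execute("CREATE INDEX idx_priority ON t (priority_date)")
conn.execute("CREATE INDEX idx_created_priority ON t (created_at, priority_date)")
conn.executemany(
    "INSERT INTO t (id, priority_date, created_at) VALUES (?, ?, ?)",
    [
        (1, "2017-10-10 08:00:00", "2017-10-10 07:00:00"),  # both columns in range
        (2, "2017-10-04 23:10:43", "2017-10-10 01:23:45"),  # priority out, created in
        (3, None, "2017-10-10 09:00:00"),                   # NULL priority, created in
        (4, None, "2017-10-09 23:59:59"),                   # NULL priority, created out
    ],
)
params = {"lo": "2017-10-10 00:00:00", "hi": "2017-10-10 23:59:59"}

first_half = {row[0] for row in conn.execute(
    "SELECT id FROM t WHERE priority_date BETWEEN :lo AND :hi", params)}
second_half = {row[0] for row in conn.execute(
    "SELECT id FROM t WHERE created_at BETWEEN :lo AND :hi AND priority_date IS NULL",
    params)}

union_all = sorted(row[0] for row in conn.execute(
    "SELECT priority_date AS created_at FROM t "
    "WHERE priority_date BETWEEN :lo AND :hi "
    "UNION ALL "
    "SELECT created_at FROM t "
    "WHERE created_at BETWEEN :lo AND :hi AND priority_date IS NULL",
    params))
expected = sorted(row[0] for row in conn.execute(
    "SELECT COALESCE(priority_date, created_at) FROM t "
    "WHERE COALESCE(priority_date, created_at) BETWEEN :lo AND :hi",
    params))
print(first_half, second_half, union_all)
```

Row 1 has both columns inside the day but appears only in the first half, so the combined result carries one value per qualifying row.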

Votes: 2

Stack Overflow user

Posted 2017-12-09 18:16:17

( SELECT  priority_date AS created_at
    FROM  table
    WHERE  priority_date >= '2017-10-10'
      AND  priority_date <  '2017-10-10' + INTERVAL 1 DAY )
UNION  DISTINCT 
( SELECT  created_at
    FROM  table
    WHERE  created_at >= '2017-10-10'
      AND  created_at <  '2017-10-10' + INTERVAL 1 DAY
      AND  priority_date IS NULL )

With

INDEX(priority_date, created_at)  -- in this order

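Notes:

  • This formulation handles date ranges other than BETWEEN better and avoids having to calculate leap days and the like. (It makes no difference in performance.)
  • The one index is both "covering" and optimal for each subquery. Comparing alternative schemes should not be necessary.
  • I picked UNION DISTINCT. It is slower than ALL, but it might be more to your application's liking. Switch to ALL if there cannot be dups, or if dups are OK.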
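
The first and third notes can be illustrated on a toy table. This is a sketch under stated assumptions: SQLite (Python's sqlite3) stands in for MySQL, the table t is hypothetical, the upper bound is computed in Python instead of with INTERVAL 1 DAY, and row 3 carries a fractional second, which the asker's columns may well not store. SQLite spells UNION DISTINCT as plain UNION.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, priority_date TEXT, created_at TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO t (id, priority_date, created_at) VALUES (?, ?, ?)",
    [
        (1, None, "2017-10-10 09:00:00"),
        (2, None, "2017-10-10 09:00:00"),      # same timestamp as row 1
        (3, None, "2017-10-10 23:59:59.500"),  # hypothetical fractional second
    ],
)
day = date(2017, 10, 10)
half_open = {"lo": day.isoformat(), "hi": (day + timedelta(days=1)).isoformat()}
closed = {"lo": "2017-10-10 00:00:00", "hi": "2017-10-10 23:59:59"}


def run(union_kind, upper_op, params):
    """Run the two-part query with the given UNION flavor and upper-bound operator."""
    sql = f"""
        SELECT priority_date AS created_at FROM t
        WHERE priority_date >= :lo AND priority_date {upper_op} :hi
        {union_kind}
        SELECT created_at FROM t
        WHERE created_at >= :lo AND created_at {upper_op} :hi
          AND priority_date IS NULL
    """
    return sorted(row[0] for row in conn.execute(sql, params))


distinct_rows = run("UNION", "<", half_open)     # duplicates collapsed
all_rows = run("UNION ALL", "<", half_open)      # one value per table row
between_rows = run("UNION ALL", "<=", closed)    # BETWEEN-style closed range
print(len(distinct_rows), len(all_rows), len(between_rows))
```

In this sample the half-open range picks up the 23:59:59.500 row that the closed range ending at 23:59:59 misses, and UNION collapses the two rows that share a timestamp, so the choice between DISTINCT and ALL changes the row count.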
Votes: 0

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/47708384
