文章/答案/技术大牛

发布

问索引设计不正确
EN

Database Administration用户

提问于 2017-11-15 08:31:44

回答 3查看 78关注 0票数 1

我们有一个名为“反馈”的表格如下：

id            - int(10) unsigned    NO  PRI     auto_increment
vendor_Id     - int(10) unsigned    YES MUL     
created_At    - datetime CURRENT_TIMESTAMP  
updated_At    - datetime(3)         
code          - int(11) YES     
feedback      - varchar(300)    YES

CREATE TABLE `Feedback` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `vendor_Id` int(10) unsigned DEFAULT NULL,
  `created_At` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_At` datetime(3) NOT NULL,
  `uniqueKey` varchar(255) DEFAULT NULL,
  `customerContactNumber` varchar(15) DEFAULT NULL,
  `customerName` varchar(30) DEFAULT NULL,
  `code` int(11) DEFAULT NULL,
  `feedback` varchar(300) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `feedback_unique` (`uniqueKey `),
  KEY `codeIndex` (`vendor_Id`,`updated_At`,`code`),
  KEY `feedbackIndex` (`vendor_Id`,`code`,`feedback`,`updated_At`),
  CONSTRAINT `feedback_vendor_id_foreign` FOREIGN KEY (`vendor_Id`) REFERENCES `Vendor` (`vendor_Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

我们需要为反馈表创建索引。

因此，我们根据字段添加了3个索引，作为where子句的一部分：

 index1 (vendor_Id, updated_At);
 codeIndex (vendor_Id, updated_At, code)
 feedbackIndex(vendor_Id, code, feedback, updated_At)

现在，当我们运行以下查询时：

explain  select  'id', 'customerName', 'customerContactNumber', 'feedback', 'updated_At'
    from  Feedback
    where  vendor_Id = 1
      and  updated_At <  '2017-11-15 12:58:12.387'
      and  updated_At >= '2017-11-01 00:00:00.000'
      and  code is not null
      and  code >=0
      and  code <= 6
    order by  updated_At DESC
    limit  10

mysql输出显示：

possible_keys : index1,codeIndex,feedbackIndex
keys : feedbackIndex

我很想知道为什么explain语句显示'feedbackIndex‘作为选择的键。我们以为它会选择'codeIndex‘

还有另外两个查询：

explain  select  code, count(*) as count
    from  Feedback
    where  vendor_Id = 1
      and  updated_At <  '2017-11-15 12:58:12.387'
      and  updated_At >= '2017-11-01 00:00:00.000'
      and  code is not null
    group by  code

===>在explain语句的输出中显示'codeIndex‘作为键。这和预期的一样

explain  select  code, count(*) as count
    from  Feedback
    where  vendor_Id = 1
      and  updated_At <  '2017-11-15 12:58:12.387'
      and  updated_At >= '2017-11-01 00:00:00.000'
      and  code is not null
      and  feedback is not null
    group by  code`

===>在explain语句的输出中显示'feedbackIndex‘作为键。这和预期的一样

我不明白为什么我们的第一个查询显示feedbackIndex作为键。

mysql

index

回答 3

Database Administration用户

发布于 2017-11-15 16:59:28

在创建最优索引时，从= --在您的情况下是vendor_id开始。

然后你得到一个“范围”的裂缝-- updated_at或(在第一种情况下) code。

另外(或者)“覆盖”查询也是有益的。这是SELECT中提到的所有列都存在于INDEX中的地方。在SELECT *的例子中，这是表中的所有列，对很多人来说通常都是这样。在执行code, count(*)时，只提到3或4列，因此“覆盖”可能是实用和有益的。

PRIMARY KEY (id)隐式地包含在任何次要密钥中。

当提出这样的问题时，提供整个EXPLAIN是很方便的。讨论"key_len“以及如何告诉您它使用了多少列是有意义的。EXPLAIN FORMAT=JSON SELECT ...甚至更好，因为它是显式的。

为什么它要为第一个查询选择(vendor_Id, code, feedback, updated_At, id)？它缺少了created_at，所以它不是“覆盖”。它可能认为vendor_id (=)加code (range)略好于vendor_id, updated_At。我认为后者(又名"Index1")会更好，因为它可能能够使用ORDER BY。

index1 (vendor_Id, updated_At)是冗余的(并且可以是DROPped)，因为codeIndex (vendor_Id, updated_At, code)。

第二个查询需要(vendor_Id, updated_At, code) -- =、"range“、"covering”。查询可以在索引中执行，而不需要接触数据。

第三个查询添加了feedback；因此它需要(vendor_Id, code, feedback, updated_At)的列，但是很难知道该顺序是最好的还是(vendor_Id, updated_At, code, feedback)。

您使用的是哪个版本? 5.6、5.7和8.0都对这方面的优化做了重大更改。

顺便说一下，and code is not null在第一个查询中是多余的。

第一个查询不能在10行之后停止；这是因为WHERE中有多个“范围”。

票数 2

Database Administration用户

发布于 2017-11-20 09:46:27

与其说是回答，不如说是评论，但我觉得要发表评论有点长，所以我会把它作为回答。

根据我对MySQL的有限经验，优化器有时会做出一些奇怪的决定。以下是我做过的一个观察：

MySQL优化器在选择两个索引时使用什么策略？

简而言之，似乎优化器选择了第一个可能的索引，尽管在那里稍后声明了更好的索引。您可以尝试以不同的顺序创建索引，并查看这是否更改了它选择的索引。

票数 1

Database Administration用户

发布于 2017-11-20 08:14:32

谢谢里克的解释。我同意index1 (vendor_Id，updated_At)是多余的。我们目前正在使用AWS Aurora，它使用MySQL5.6。这不是一个答复，但由于我的评论很长，我不得不把它作为答复。

我创建了以下新索引：

codeIndex - ['vendor_Id', 'updated_At', 'code']
feedbackIndex - ['vendor_Id', 'updated_At', 'code', 'feedback']

现在，我运行了以下两个查询：

查询1：

 EXPLAIN FORMAT=JSON select  'id', 'customerName', 'customerContactNumber', 'code', 'feedback', 'updated_At'
    from  Feedback
    where  vendor_Id = 1
      and  updated_At <  '2017-11-15 12:58:12.387'
      and  updated_At >= '2017-11-01 00:00:00.000'
      and  code is not null
      and  code >=0
      and  code <= 6
    order by  updated_At DESC
    limit  10 
{
  "query_block": {
    "select_id": 1,
    "ordering_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "Feedback",
        "access_type": "ref",
        "possible_keys": [
          "codeIndex",
          "feedbackIndex"
        ],
        "key": "codeIndex",
        "used_key_parts": [
          "vendor_Id"
        ],
        "key_length": "5",
        "ref": [
          "const"
        ],
        "rows": 1,
        "filtered": 100,
        "using_index": true,
        "attached_condition": "((`dEbill`.`Feedback`.`updated_At` < '2017-11-15 12:58:12.387') and (`dEbill`.`Feedback`.`updated_At` >= '2017-11-01 00:00:00.000') and (`dEbill`.`Feedback`.`code` >= 0) and (`dEbill`.`Feedback`.`code` <= 6))"
      }
    }
  }
}

DB选择codeIndex，这似乎是个好主意。因为，我在中添加了字段，我是否应该将这些字段包括在索引中作为覆盖？我认为它只会使索引表太大，不适合我使用的字段数量。

查询2：

EXPLAIN FORMAT=JSON  select  code, count(*) as count
    from  Feedback
    where  vendor_Id = 1
      and  updated_At <  '2017-11-15 12:58:12.387'
      and  updated_At >= '2017-11-01 00:00:00.000'
      and  code is not null
      and  feedback is not null
    group by  code;



{
  "query_block": {
    "select_id": 1,
    "grouping_operation": {
      "using_temporary_table": true,
      "using_filesort": true,
      "table": {
        "table_name": "Feedback",
        "access_type": "ref",
        "possible_keys": [
          "codeIndex",
          "feedbackIndex"
        ],
        "key": "codeIndex",
        "used_key_parts": [
          "vendor_Id"
        ],
        "key_length": "5",
        "ref": [
          "const"
        ],
        "rows": 1,
        "filtered": 100,
        "index_condition": "((`dEbill`.`Feedback`.`updated_At` < '2017-11-15 12:58:12.387') and (`dEbill`.`Feedback`.`updated_At` >= '2017-11-01 00:00:00.000') and (`dEbill`.`Feedback`.`code` is not null))",
        "attached_condition": "((`dEbill`.`Feedback`.`vendor_Id` <=> 1) and (`dEbill`.`Feedback`.`feedback` is not null))"
      }
    }
  }
}

在这种情况下，DB选择codeIndex ->我认为feedbackIndex是一个更好的选择。你的想法？即使我将feedbackIndex定义为卖主_Id，代码，反馈，更新_在‘’，输出也保持不变。而且，此时所有的表都是空的。

票数 0

页面原文内容由Database Administration提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://dba.stackexchange.com/questions/190958

复制

相似问题

问索引设计不正确
EN

回答 3

Database Administration用户

Database Administration用户

Database Administration用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问索引设计不正确EN

回答 3

Database Administration用户

Database Administration用户

Database Administration用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问索引设计不正确
EN