文章/答案/技术大牛

发布

社区首页 >问答首页 >为什么要扫描PostgreSQL查询表？

问为什么要扫描PostgreSQL查询表？
EN

Database Administration用户

提问于 2015-10-21 18:23:38

回答 1查看 81关注 0票数 0

我正在运行PostgreSQL 9.3。该数据库是所有医疗非营利机构(国家提供者编号)。我创建了以下索引：

CREATE INDEX indiv_provider_business_mailing_address_state_name_string_index
  ON individuals (provider_business_mailing_address_state_name);
CREATE INDEX indiv_provider_business_practice_location_address_state_name_string_index
  ON individuals (provider_business_practice_location_address_state_name);
CREATE INDEX indiv_upper_provider_business_mailing_address_state_name_string_index
  ON individuals (UPPER(provider_business_mailing_address_state_name));
CREATE INDEX indiv_upper_provider_business_practice_location_address_state_name_string_index
  ON individuals (UPPER(provider_business_practice_location_address_state_name));

但是，查询仍然会扫描：

Postgres Running
  SELECT npi, provider_first_name,
    provider_last_name_legal_name,
    provider_organization_name_legal_business_name,
    provider_first_line_business_mailing_address,
    provider_second_line_business_mailing_address,
    provider_business_mailing_address_city_name,
    provider_business_mailing_address_state_name,
    substr(provider_business_mailing_address_postal_code, 1, 5)
      AS provider_business_mailing_address_postal_code,
    provider_business_mailing_address_telephone_number,
    provider_business_mailing_address_fax_number,
    provider_first_line_business_practice_location_address,
    provider_second_line_business_practice_location_address,
    provider_business_practice_location_address_city_name,
    provider_business_practice_location_address_state_name,
    substr(provider_business_practice_location_address_postal_code, 1, 5)
      AS provider_business_practice_location_address_postal_code,
    provider_business_practice_location_address_telephone_number,
    provider_business_practice_location_address_fax_number
  FROM individuals
  WHERE ((((provider_business_mailing_address_state_name = UPPER(regexp_replace(?, '\s+', ' ', 'g'))))
    OR ((provider_business_practice_location_address_state_name = UPPER(regexp_replace(?, '\s+', ' ', 'g'))))))
  LIMIT 100

解释产出：

Postgres Limit  (cost=0.00..400.58 rows=100 width=141)
  ->  Seq Scan on individuals  (cost=0.00..1210010.61 rows=302063 width=141)
        Filter: ((provider_business_mailing_address_state_name = 'PA'::text) OR (provider_business_practice_location_address_state_name = 'PA'::text))

知道为什么吗？

更新

我按照changed (https://dba.stackexchange.com/a/118850/7924)的建议更改了查询，但它仍然是表扫描：

Postgres Running
  SELECT npi,
    provider_first_name,
    provider_last_name_legal_name,
    provider_organization_name_legal_business_name,
    provider_first_line_business_mailing_address,
    provider_second_line_business_mailing_address,
    provider_business_mailing_address_city_name,
    provider_business_mailing_address_state_name,
    substr(provider_business_mailing_address_postal_code, 1, 5)
      AS provider_business_mailing_address_postal_code,
    provider_business_mailing_address_telephone_number,
    provider_business_mailing_address_fax_number,
    provider_first_line_business_practice_location_address,
    provider_second_line_business_practice_location_address,
    provider_business_practice_location_address_city_name,
    provider_business_practice_location_address_state_name,
    substr(provider_business_practice_location_address_postal_code, 1, 5)
      AS provider_business_practice_location_address_postal_code,
    provider_business_practice_location_address_telephone_number,
    provider_business_practice_location_address_fax_number
  FROM individuals
  WHERE ((((provider_business_mailing_address_state_name = ?))
    OR ((provider_business_practice_location_address_state_name = ?))))
  LIMIT 100

解释产出：

Postgres Limit  (cost=0.00..400.58 rows=100 width=141)
  ->  Seq Scan on individuals  (cost=0.00..1210010.61 rows=302063 width=141)
    Filter: ((provider_business_mailing_address_state_name = 'PA'::text) OR (provider_business_practice_location_address_state_name = 'PA'::text))

postgresql

index-tuning

回答 1

Database Administration用户

发布于 2015-10-22 02:44:33

表达式索引不使用，也不可用。

您正在索引一个与您所查询的内容不匹配的表达式。表达式在参数端，例如：

UPPER(regexp_replace(?, '\s+', ' ', 'g'))

PostgreSQL是基于替代参数的常量折叠，在您已经提供的情况下，它变成了文字'PA'。然后在provider_business_mailing_address_state_name列中查找它。

如果您打算像表达式索引所暗示的那样，针对provider_business_mailing_address_state_name列的上感知、裁剪版本进行搜索，那么您也必须在SQL中使用该表达式。

OR ((UPPER(regexp_replace(provider_business_practice_location_address_state_name, '\s+', ' ', 'g')) = UPPER(regexp_replace(?, '\s+', ' ', 'g'))))))

但坦率地说，在这种情况下，我建议将表中的数据规范化，而不是依赖表达式索引。

票数 3

页面原文内容由Database Administration提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://dba.stackexchange.com/questions/118798

复制

相似问题

问为什么要扫描PostgreSQL查询表？
EN

更新

回答 1

Database Administration用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问为什么要扫描PostgreSQL查询表？EN

更新

回答 1

Database Administration用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问为什么要扫描PostgreSQL查询表？
EN