首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Postgres没有对更高的偏移量使用不同的查询计划

Postgres没有对更高的偏移量使用不同的查询计划
EN

Stack Overflow用户
提问于 2017-01-25 20:19:12
回答 1查看 27关注 0票数 0

我有一个postgres查询

代码语言:javascript
复制
explain SELECT  "facilities".* FROM "facilities" INNER JOIN 
resource_indices ON resource_indices.resource_id = facilities.uuid WHERE 
(client_id IS NULL OR (client_tag=NULL AND client_id=7)) 
AND (ARRAY['country:india']::varchar[] && resource_indices.tags) 
AND "facilities"."is_listed" = 't'  
ORDER BY resource_indices.name LIMIT 11 OFFSET 100;

观察偏移量。当偏移量小于200时,它使用索引,工作正常。其查询计划如下

代码语言:javascript
复制
             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=23416.57..24704.45 rows=11 width=1457) (actual time=41.951..43.035 rows=11 loops=1)
   ->  Nested Loop  (cost=0.71..213202.15 rows=1821 width=1457) (actual time=2.107..43.007 rows=211 loops=1)
         ->  Index Scan using index_resource_indices_on_name on resource_indices  (cost=0.42..190226.95 rows=12460 width=28) (actual time=2.096..40.790 rows=408 loops=1)
               Filter: ('{country:india}'::character varying[] && tags)
               Rows Removed by Filter: 4495
         ->  Index Scan using index_facilities_on_uuid on facilities  (cost=0.29..1.83 rows=1 width=1445) (actual time=0.005..0.005 rows=1 loops=408)
               Index Cond: (uuid = resource_indices.resource_id)
               Filter: ((client_id IS NULL) AND is_listed)
 Planning time: 1.259 ms
 Execution time: 43.121 ms
(10 rows)

比方说,增加偏移量为400会开始使用散列连接,并且会产生更差的性能。增加偏移量会带来更差的性能。

代码语言:javascript
复制
         QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=34508.62..34508.65 rows=11 width=1457) (actual time=136.288..136.291 rows=11 loops=1)
   ->  Sort  (cost=34507.62..34512.18 rows=1821 width=1457) (actual time=136.224..136.268 rows=411 loops=1)
         Sort Key: resource_indices.name
         Sort Method: top-N heapsort  Memory: 638kB
         ->  Hash Join  (cost=29104.96..34419.46 rows=1821 width=1457) (actual time=23.885..95.099 rows=6518 loops=1)
               Hash Cond: (facilities.uuid = resource_indices.resource_id)
               ->  Seq Scan on facilities  (cost=0.00..4958.39 rows=33790 width=1445) (actual time=0.010..48.732 rows=33711 loops=1)
                     Filter: ((client_id IS NULL) AND is_listed)
                     Rows Removed by Filter: 848
               ->  Hash  (cost=28949.21..28949.21 rows=12460 width=28) (actual time=23.311..23.311 rows=12601 loops=1)
                     Buckets: 2048  Batches: 1  Memory Usage: 814kB
                     ->  Bitmap Heap Scan on resource_indices  (cost=1048.56..28949.21 rows=12460 width=28) (actual time=9.369..18.710 rows=12601 loops=1)
                           Recheck Cond: ('{country:india}'::character varying[] && tags)
                           Heap Blocks: exact=7334
                           ->  Bitmap Index Scan on index_resource_indices_on_tags  (cost=0.00..1045.45 rows=12460 width=0) (actual time=7.680..7.680 rows=13889 loops=1)
                                 Index Cond: ('{country:india}'::character varying[] && tags)
 Planning time: 1.408 ms
 Execution time: 136.465 ms
(18 rows)

我该如何解决这个问题?谢谢

EN

回答 1

Stack Overflow用户

发布于 2017-01-25 21:28:35

这是不可避免的,因为没有其他方法来实现LIMIT 10 OFFSET 10000,只能获取前10010行,然后丢弃除最后10行之外的所有行。随着偏移量的增加,这必然会导致性能越来越差。

PostgreSQL切换到不同的计划,因为它必须检索更多的结果行,并且当需要更多的结果行时,快速检索前几行并且通常涉及嵌套循环联接的“快速启动”计划的性能将比其他计划差。

OFFSET是邪恶的,在大多数情况下你应该避免使用它。阅读what Markus Winand has to say about this topic,特别是如何在不使用OFFSET的情况下对结果集进行分页。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/41851486

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档