I have a table transfer_event from which I need to find the latest n entries for a set of tokens; both n and the tokens are supplied by the user. I am on Postgres 9.6.
The problem is that a single query fetching the top n entries for a list of tokens is much slower than running a separate query for each token's top n entries (55s vs. 10s), and I can't tell what work the extra 45s is being spent on.
For the combined query, I settled on a lateral join (an approach using window functions was even slower):
-- There's 50 tokens to look for
with tokens(t) as (values('0x..',..))
select *
from tokens t
cross join lateral (
select *
from transfer_event c
where c.token = t.t
and c.block_range @> 2147483647
order by c.timestamp desc
limit 5) x

Here is the query plan:
Nested Loop (cost=1.19..855.09 rows=250 width=250) (actual time=0.120..56313.953 rows=230 loops=1)
Buffers: shared hit=6567497 read=3119272
CTE tokens
-> Values Scan on "*VALUES*" (cost=0.00..0.62 rows=50 width=32) (actual time=0.003..0.072 rows=50 loops=1)
-> CTE Scan on tokens t (cost=0.00..1.00 rows=50 width=32) (actual time=0.006..0.168 rows=50 loops=1)
-> Limit (cost=0.56..16.97 rows=5 width=250) (actual time=1119.814..1126.273 rows=5 loops=50)
Buffers: shared hit=6567497 read=3119272
-> Index Scan Backward using attr_6_7_transfer_event_timestamp on transfer_event c (cost=0.56..1432759.34 rows=436637 width=250) (actual time=1119.814..1126.270 rows=5 loops=50)
Filter: ((block_range @> 2147483646) AND (token = t.t))
Rows Removed by Filter: 1875532
Buffers: shared hit=6567497 read=3119272
Planning time: 3.309 ms
Execution time: 56314.050 ms
(13 rows)

This query takes 55s; if I run individual queries, one per token, going through all of them takes only 10s. My understanding is that the query above is essentially what I would do manually: loop over every token and execute:
select * from transfer_event c
where c.token = '$tk'
and c.block_range @> 2147483646
order by timestamp desc limit 5

Query plan for one sample token:
Limit (cost=0.56..315.27 rows=5 width=250) (actual time=8.664..10.816 rows=5 loops=1)
-> Index Scan Backward using attr_6_7_transfer_event_timestamp on transfer_event c (cost=0.56..1432759.34 rows=22763 width=250) (actual time=8.663..10.813 rows=5 loops=1)
Filter: ((block_range @> 2147483646) AND (token = '0xec67005c4e498ec7f55e092bd1d35cbc47c91892'::text))
Rows Removed by Filter: 18562
Planning time: 1.437 ms
Execution time: 10.871 ms
(6 rows)

Update based on the replies:

Thanks to Jani and Laurenz Albe for their comments on the question; following up on them, here is some more information:
The timings were taken with a warm cache (the queries were run several times before the plans I posted).
For the token-by-token lookups, Postgres does indeed change the query plan. The one above is the most common, but two other plans show up for a few of the tokens.
For three of the tokens:
Limit (cost=5284.08..5284.09 rows=5 width=250) (actual time=5.123..5.125 rows=5 loops=1)
Buffers: shared hit=1766
-> Sort (cost=5284.08..5287.43 rows=1339 width=250) (actual time=5.122..5.122 rows=5 loops=1)
Sort Key: "timestamp" DESC
Sort Method: top-N heapsort Memory: 27kB
Buffers: shared hit=1766
-> Bitmap Heap Scan on transfer_event c (cost=62.94..5261.84 rows=1339 width=250) (actual time=0.671..3.990 rows=2184 loops=1)
Recheck Cond: (token = '0x09617f6fd6cf8a71278ec86e23bbab29c04353a7'::text)
Filter: (block_range @> 2147483646)
Heap Blocks: exact=1740
Buffers: shared hit=1763
-> Bitmap Index Scan on attr_6_1_transfer_event_token (cost=0.00..62.61 rows=1339 width=0) (actual time=0.405..0.405 rows=2184 loops=1)
Index Cond: (token = '0x09617f6fd6cf8a71278ec86e23bbab29c04353a7'::text)
Buffers: shared hit=23
Planning time: 1.265 ms
Execution time: 5.222 ms
(16 rows)

For four of the tokens:
Limit (cost=8.59..8.60 rows=1 width=250) (actual time=0.058..0.058 rows=0 loops=1)
Buffers: shared hit=7
-> Sort (cost=8.59..8.60 rows=1 width=250) (actual time=0.057..0.057 rows=0 loops=1)
Sort Key: "timestamp" DESC
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=7
-> Index Scan using attr_6_1_transfer_event_token on transfer_event c (cost=0.56..8.58 rows=1 width=250) (actual time=0.028..0.028 rows=0 loops=1)
Index Cond: (token = '0x0cbe2df57ca9191b64a7af3baa3f946fa7df2f25'::text)
Filter: (block_range @> 2147483646)
Buffers: shared hit=4
Planning time: 1.404 ms
Execution time: 0.107 ms
(12 rows)

Posted on 2020-01-10 08:04:58
You can see that the second query has Rows Removed by Filter: 18562, while on average the first query has Rows Removed by Filter: 1875532.
If you run 50 separate queries, the optimizer optimizes each one individually, and where only a few rows satisfy the filter condition it may choose a different, better execution plan. The query with the lateral join does not have that option: it has to pick a single plan that is used for all 50 loops, and that plan is bad for some of the values.
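One way to keep the combined round trip while still getting per-token planning is to issue the per-token statement dynamically from a set-returning plpgsql function: EXECUTE plans each statement afresh with the concrete parameter values, much like the separate hand-run queries. This is a sketch of my own, not from the answer; the function name is made up, and the table/column names are taken from the question:

```sql
-- Hypothetical helper: one dynamically executed query per token,
-- so the planner can pick a different plan for each token.
create or replace function latest_transfers(tokens text[], n int)
returns setof transfer_event
language plpgsql stable as $$
declare
  tk text;
begin
  foreach tk in array tokens loop
    return query execute
      'select * from transfer_event c
        where c.token = $1
          and c.block_range @> 2147483646
        order by c.timestamp desc
        limit $2'
      using tk, n;
  end loop;
end;
$$;

-- usage:
-- select * from latest_transfers(array['0x..', '0x..'], 5);
```

The trade-off is 50 small plans instead of one shared plan, which matches the 10s behaviour of the separate queries at the cost of some per-statement planning overhead.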
One thing you could try is an index covering the conditions on transfer_event.block_range and transfer_event.token, while making sure the index on transfer_event.timestamp cannot be used, for example with ORDER BY c.timestamp + INTERVAL '0 days' DESC. I can't say whether that will improve performance, but it is worth a try.
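Concretely, that suggestion might look like the sketch below. The index name is mine; I'm assuming block_range is a range type (the @> operator suggests so), which means a GiST index, and combining it with equality on a text column needs the btree_gist extension:

```sql
-- Assumed setup: btree_gist lets a GiST index mix the equality
-- condition on token with the @> containment on block_range.
create extension if not exists btree_gist;
create index transfer_event_token_block_range
  on transfer_event using gist (token, block_range);

-- Lateral subquery rewritten so the ORDER BY expression no longer
-- matches the timestamp index, steering the planner away from it:
select *
from tokens t
cross join lateral (
  select *
  from transfer_event c
  where c.token = t.t
    and c.block_range @> 2147483646
  order by c.timestamp + interval '0 days' desc
  limit 5) x;
```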
https://dba.stackexchange.com/questions/257069