I'm using the public Postgres database of NEAR Protocol: https://github.com/near/near-indexer-for-explorer#shared-public-access
postgres://public_readonly:nearprotocol@mainnet.db.explorer.indexer.near.dev/mainnet_explorer
SELECT "public"."receipts"."receipt_id",
"public"."receipts"."included_in_block_hash",
"public"."receipts"."included_in_chunk_hash",
"public"."receipts"."index_in_chunk",
"public"."receipts"."included_in_block_timestamp",
"public"."receipts"."predecessor_account_id",
"public"."receipts"."receiver_account_id",
"public"."receipts"."receipt_kind",
"public"."receipts"."originated_from_transaction_hash"
FROM "public"."receipts"
WHERE ("public"."receipts"."receipt_id") IN
(SELECT "t0"."receipt_id"
FROM "public"."receipts" AS "t0"
INNER JOIN "public"."action_receipts" AS "j0" ON ("j0"."receipt_id") = ("t0"."receipt_id")
WHERE ("j0"."signer_account_id" = 'ryancwalsh.near'
AND "t0"."receipt_id" IS NOT NULL))
ORDER BY "public"."receipts"."included_in_block_timestamp" DESC
LIMIT 1
OFFSET 0
The result is always:
ERROR: canceling statement due to statement timeout
SQL state: 57014
But if I change it to LIMIT 2, the query runs in under 1 second.
How can that be? Does it mean the database is poorly set up? Or am I doing something wrong?
The query here is generated by Prisma. findFirst always times out, so I think I may need to change it to findMany as a workaround.
Posted on 2022-02-08 02:33:02
Your query can be simplified / optimized to:
SELECT r.receipt_id
, r.included_in_block_hash
, r.included_in_chunk_hash
, r.index_in_chunk
, r.included_in_block_timestamp
, r.predecessor_account_id
, r.receiver_account_id
, r.receipt_kind
, r.originated_from_transaction_hash
FROM public.receipts r
WHERE EXISTS (
SELECT FROM public.action_receipts j
WHERE j.receipt_id = r.receipt_id
AND j.signer_account_id = 'ryancwalsh.near'
)
ORDER BY r.included_in_block_timestamp DESC
LIMIT 1;
However, this only scratches the surface of your underlying problem.
As Kirk already commented, Postgres chooses a different query plan for LIMIT 1, obviously not aware that there are only 90 rows in the table with signer_account_id = 'ryancwalsh.near', while both tables involved hold more than 220 million rows and are apparently growing steadily.
Changing to LIMIT 2 makes a different query plan look more favorable, hence the dramatic difference in performance. (So the query planner has the general idea that the filter is very selective, just not close enough for the corner case of LIMIT 1.)
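To see this for yourself, you can wrap the statement in EXPLAIN and compare what the planner picks for LIMIT 1 versus LIMIT 2. A minimal sketch (plain EXPLAIN only shows estimates and avoids the statement timeout; EXPLAIN ANALYZE would actually run the query):
EXPLAIN
SELECT r.receipt_id, r.included_in_block_timestamp
FROM public.receipts r
WHERE EXISTS (
SELECT FROM public.action_receipts j
WHERE j.receipt_id = r.receipt_id
AND j.signer_account_id = 'ryancwalsh.near'
)
ORDER BY r.included_in_block_timestamp DESC
LIMIT 1; -- then repeat with LIMIT 2 and compare the plans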
You should have mentioned those cardinalities to put us on the right track.
Knowing that our filter is that selective, we can force a more favorable query plan with a different query:
WITH j AS (
SELECT receipt_id -- is PK!
FROM public.action_receipts
WHERE signer_account_id = 'ryancwalsh.near'
)
SELECT r.receipt_id
, r.included_in_block_hash
, r.included_in_chunk_hash
, r.index_in_chunk
, r.included_in_block_timestamp
, r.predecessor_account_id
, r.receiver_account_id
, r.receipt_kind
, r.originated_from_transaction_hash
FROM j
JOIN public.receipts r USING (receipt_id)
ORDER BY r.included_in_block_timestamp DESC
LIMIT 1;
This uses the same query plan for LIMIT 1 or 2, and finished in under 2 ms in my test:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=134904.89..134904.89 rows=1 width=223) (actual time=1.750..1.754 rows=1 loops=1)
CTE j
-> Bitmap Heap Scan on action_receipts (cost=319.46..41564.59 rows=10696 width=44) (actual time=0.058..0.179 rows=90 loops=1)
Recheck Cond: (signer_account_id = 'ryancwalsh.near'::text)
Heap Blocks: exact=73
-> Bitmap Index Scan on action_receipt_signer_account_id_idx (cost=0.00..316.79 rows=10696 width=0) (actual time=0.043..0.043 rows=90 loops=1)
Index Cond: (signer_account_id = 'ryancwalsh.near'::text)
-> Sort (cost=93340.30..93367.04 rows=10696 width=223) (actual time=1.749..1.750 rows=1 loops=1)
Sort Key: r.included_in_block_timestamp DESC
Sort Method: top-N heapsort Memory: 25kB
-> Nested Loop (cost=0.70..93286.82 rows=10696 width=223) (actual time=0.089..1.705 rows=90 loops=1)
-> CTE Scan on j (cost=0.00..213.92 rows=10696 width=32) (actual time=0.060..0.221 rows=90 loops=1)
-> Index Scan using receipts_pkey on receipts r (cost=0.70..8.70 rows=1 width=223) (actual time=0.016..0.016 rows=1 loops=90)
Index Cond: (receipt_id = j.receipt_id)
Planning Time: 0.281 ms
Execution Time: 1.801 ms
The point is to execute the hugely selective query in the CTE first. Then Postgres does not try to walk the index on (included_in_block_timestamp) on the assumption that it will find a matching row soon enough. (Which it would not.)
The DB at hand runs Postgres 11, where a CTE is always an optimization barrier. In Postgres 12 or later, add AS MATERIALIZED to the CTE to guarantee the same effect.
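On Postgres 12 or later the same query would spell that keyword out explicitly; a sketch (the select list is abbreviated to r.* here):
WITH j AS MATERIALIZED (
SELECT receipt_id -- is PK!
FROM public.action_receipts
WHERE signer_account_id = 'ryancwalsh.near'
)
SELECT r.* -- same column list as above
FROM j
JOIN public.receipts r USING (receipt_id)
ORDER BY r.included_in_block_timestamp DESC
LIMIT 1;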
Or you can use the "OFFSET 0 hack" in any version:
SELECT r.receipt_id
, r.included_in_block_hash
, r.included_in_chunk_hash
, r.index_in_chunk
, r.included_in_block_timestamp
, r.predecessor_account_id
, r.receiver_account_id
, r.receipt_kind
, r.originated_from_transaction_hash
FROM (
SELECT receipt_id -- is PK!
FROM public.action_receipts
WHERE signer_account_id = 'ryancwalsh.near'
OFFSET 0 -- !
) j
JOIN public.receipts r USING (receipt_id)
ORDER BY r.included_in_block_timestamp DESC
LIMIT 1;
This prevents "inlining" of the subquery, to the same effect. Finishes in < 2 ms.
"Fix" the database?
The proper fix depends on the complete picture. The underlying problem is that Postgres overestimates the number of qualifying rows in the table action_receipts. The MCV list (most common values) cannot keep up with 220 million rows (and growing). Most likely it is not just ANALYZE lagging behind, though it could be (autovacuum not configured properly? a rookie mistake?); a quick way to check that is sketched after option 2 below. Depending on the actual cardinality (data distribution) of action_receipts.signer_account_id and the access patterns, there are various things you could do to "fix" it. Two options:
1. Raise n_distinct and STATISTICS
If most values in action_receipts.signer_account_id are similarly rare (high cardinality), consider setting a very large n_distinct value for that column. And combine that with a moderately increased STATISTICS target for the same column to counter errors in the other direction (underestimating the number of qualifying rows for common values).
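A minimal sketch of what both adjustments could look like; the concrete numbers are placeholders and would have to be derived from the actual data distribution:
ALTER TABLE public.action_receipts
ALTER COLUMN signer_account_id SET (n_distinct = 50000000); -- placeholder; a negative value such as -0.05 means "5 % of the row count"
ALTER TABLE public.action_receipts
ALTER COLUMN signer_account_id SET STATISTICS 1000; -- default target is 100
ANALYZE public.action_receipts; -- rebuild statistics so the new settings take effect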
2. Local fix with a partial index
If action_receipts.signer_account_id = 'ryancwalsh.near' is special in that it is queried much more often than others, consider a small partial index for it, to fix just that case. Like:
CREATE INDEX ON action_receipts (receipt_id)
WHERE signer_account_id = 'ryancwalsh.near';
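To rule out that the statistics are merely stale (the ANALYZE / autovacuum question raised above), a quick check against the standard statistics view pg_stat_user_tables helps:
SELECT relname, last_analyze, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname IN ('receipts', 'action_receipts');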
https://stackoverflow.com/questions/71026316