PostgreSQL Bitmap Scan vs. Index Scan: Optimizing the Same SQL Gives Very Different Results!

AustinDatabases

Published 2026-03-12 19:27:53

This article is part of the column: AustinDatabases

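Something came up recently that I had not thought through before: a PostgreSQL optimization problem about how a query behaves under high concurrency once indexes have been created.

Work is busy and there is a lot going on, so many things simply get no attention. It still has to be said: giving database optimization more time and more people does reduce cost, but because the payoff is not obvious, or goes unnoticed, nobody bothers.

The point is that adding an index to a database and optimizing a SQL statement are two different things. The question here is which is better under high concurrency: Bitmap Index Scan or Index Scan.

Let's do an exercise and reproduce the problem first.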

test=# 
test=# DROP TABLE IF EXISTS users CASCADE;
NOTICE:  table "users" does not exist, skipping
DROP TABLE
test=# 
test=# CREATE TABLE users (
test(#     user_id bigint PRIMARY KEY,
test(#     user_name text
test(# );
CREATE TABLE
test=# 
test=# INSERT INTO users
test-# SELECT id, 'user_' || id
test-# FROM generate_series(1, 1000) id;
INSERT 0 1000
test=# 
test=# 
test=# DROP TABLE IF EXISTS orders;
NOTICE:  table "orders" does not exist, skipping
DROP TABLE
test=# 
test=# CREATE TABLE orders (
test(#     order_id   bigserial PRIMARY KEY,
test(#     user_id    bigint NOT NULL,
test(#     status     text NOT NULL,
test(#     created_at timestamptz NOT NULL,
test(#     amount     numeric
test(# );
CREATE TABLE
test=# 
test=# 
test=# 
test=# INSERT INTO orders (user_id, status, created_at, amount)
test-# SELECT
test-#     (random() * 999 + 1)::int,
test-#     CASE WHEN random() < 0.7 THEN 'PAID' ELSE 'NEW' END,
test-#     now() - (random() * interval '30 days'),
test-#     random() * 1000
test-# FROM generate_series(1, 1000000);
INSERT 0 1000000
test=# 
test=# 
test=# \timing
Timing is on.
test=# EXPLAIN (ANALYZE, BUFFERS)
test-# SELECT
test-#     u.user_name,
test-#     count(*)
test-# FROM
test-#     orders o
test-# JOIN users u ON o.user_id = u.user_id
test-# WHERE
test-#     o.user_id = 42
test-#     AND o.status = 'PAID'
test-# GROUP BY u.user_name;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Finalize GroupAggregate  (cost=16619.87..16622.31 rows=1 width=16) (actual time=27.053..29.310 rows=1.00 loops=1)
   Group Key: u.user_name
   Buffers: shared hit=9382
   ->  Gather Merge  (cost=16619.87..16622.29 rows=2 width=16) (actual time=27.006..29.271 rows=3.00 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=9382
         ->  Partial GroupAggregate  (cost=15619.84..15622.03 rows=1 width=16) (actual time=22.965..23.018 rows=1.00 loops=3)
               Group Key: u.user_name
               Buffers: shared hit=9382
               ->  Sort  (cost=15619.84..15620.57 rows=291 width=8) (actual time=20.692..21.811 rows=235.67 loops=3)
                     Sort Key: u.user_name
                     Sort Method: quicksort  Memory: 25kB
                     Buffers: shared hit=9382
                     Worker 0:  Sort Method: quicksort  Memory: 25kB
                     Worker 1:  Sort Method: quicksort  Memory: 25kB
                     ->  Nested Loop  (cost=0.28..15607.93 rows=291 width=8) (actual time=0.116..19.546 rows=235.67 loops=3)
                           Buffers: shared hit=9366
                           ->  Parallel Seq Scan on orders o  (cost=0.00..15596.00 rows=291 width=8) (actual time=0.041..12.387 rows=235.67 loops=3)
                                 Filter: ((user_id = 42) AND (status = 'PAID'::text))
                                 Rows Removed by Filter: 333098
                                 Buffers: shared hit=9346
                           ->  Materialize  (cost=0.28..8.30 rows=1 width=16) (actual time=0.005..0.010 rows=1.00 loops=707)
                                 Storage: Memory  Maximum Storage: 17kB
                                 Buffers: shared hit=20
                                 ->  Index Scan using users_pkey on users u  (cost=0.28..8.29 rows=1 width=16) (actual time=0.027..0.037 rows=1.00 loops=3)
                                       Index Cond: (user_id = 42)
                                       Index Searches: 3
                                       Buffers: shared hit=20
 Planning:
   Buffers: shared hit=60
 Planning Time: 0.257 ms
 Execution Time: 29.372 ms
(33 rows)

Time: 30.414 ms
test=# 
test=# 
test=# 
test=# 
test=# CREATE INDEX idx_orders_user
test-# ON orders(user_id);
CREATE INDEX
Time: 251.459 ms
test=# 
test=# CREATE INDEX idx_orders_status
test-# ON orders(status);
CREATE INDEX
Time: 302.121 ms
test=# 
test=# EXPLAIN (ANALYZE, BUFFERS)
SELECT             
    u.user_name,
    count(*)
FROM
    orders o
JOIN users u ON o.user_id = u.user_id
WHERE
    o.user_id = 42
    AND o.status = 'PAID'
GROUP BY u.user_name;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=2929.47..2929.48 rows=1 width=16) (actual time=20.371..20.414 rows=1.00 loops=1)
   Group Key: u.user_name
   Batches: 1  Memory Usage: 32kB
   Buffers: shared hit=930 read=3
   ->  Nested Loop  (cost=12.35..2925.97 rows=699 width=8) (actual time=0.465..15.479 rows=707.00 loops=1)
         Buffers: shared hit=930 read=3
         ->  Index Scan using users_pkey on users u  (cost=0.28..8.29 rows=1 width=16) (actual time=0.017..0.029 rows=1.00 loops=1)
               Index Cond: (user_id = 42)
               Index Searches: 1
               Buffers: shared hit=3
         ->  Bitmap Heap Scan on orders o  (cost=12.08..2910.69 rows=699 width=8) (actual time=0.423..6.631 rows=707.00 loops=1)
               Recheck Cond: (user_id = 42)
               Filter: (status = 'PAID'::text)
               Rows Removed by Filter: 279
               Heap Blocks: exact=927
               Buffers: shared hit=927 read=3
               ->  Bitmap Index Scan on idx_orders_user  (cost=0.00..11.90 rows=997 width=0) (actual time=0.183..0.188 rows=986.00 loops=1)
                     Index Cond: (user_id = 42)
                     Index Searches: 1
                     Buffers: shared read=3
 Planning:
   Buffers: shared hit=29 read=2
 Planning Time: 0.383 ms
 Execution Time: 20.485 ms
(24 rows)

Time: 21.646 ms
test=# 
test=# 
test=# 
test=# DROP INDEX idx_orders_user;
DROP INDEX
Time: 1.940 ms
test=# DROP INDEX idx_orders_status;
DROP INDEX
Time: 1.755 ms
test=# 
test=# CREATE INDEX idx_orders_user_status
test-# ON orders (user_id, status);
CREATE INDEX
Time: 331.547 ms
test=# 
test=# 
test=# EXPLAIN (ANALYZE, BUFFERS)
test-# SELECT
test-#     u.user_name,
test-#     count(*)
test-# FROM
test-#     orders o
test-# JOIN users u ON o.user_id = u.user_id
test-# WHERE
test-#     o.user_id = 42
test-#     AND o.status = 'PAID'
test-# GROUP BY u.user_name;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=37.18..37.19 rows=1 width=16) (actual time=13.835..13.870 rows=1.00 loops=1)
   Group Key: u.user_name
   Batches: 1  Memory Usage: 32kB
   Buffers: shared hit=4 read=3
   ->  Nested Loop  (cost=0.70..33.69 rows=699 width=8) (actual time=0.065..10.329 rows=707.00 loops=1)
         Buffers: shared hit=4 read=3
         ->  Index Scan using users_pkey on users u  (cost=0.28..8.29 rows=1 width=16) (actual time=0.012..0.023 rows=1.00 loops=1)
               Index Cond: (user_id = 42)
               Index Searches: 1
               Buffers: shared hit=3
         ->  Index Only Scan using idx_orders_user_status on orders o  (cost=0.42..18.41 rows=699 width=8) (actual time=0.038..3.787 rows=707.00 loops=1)
               Index Cond: ((user_id = 42) AND (status = 'PAID'::text))
               Heap Fetches: 0
               Index Searches: 1
               Buffers: shared hit=1 read=3
 Planning:
   Buffers: shared hit=22 read=1
 Planning Time: 0.212 ms
 Execution Time: 13.919 ms
(19 rows)

Time: 14.738 ms
test=# 
test=#

Let's analyze the results.

Bitmap Index Scan plan
 
idx_orders_user
   ↓
Bitmap Index Scan
   ↓  (build the bitmap)
Bitmap Heap Scan
   ↓  (heap fetch + filter on status)
Nested Loop
   ↓
HashAggregate

Buffers: shared hit=930 read=3

Heap Blocks: exact=927

Memory Usage: 32kB

Index Only Scan plan

idx_orders_user_status
   ↓
Index Only Scan
   ↓
Nested Loop
   ↓
HashAggregate

Buffers: shared hit=4 read=3

Heap Fetches: 0
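To put the two plans side by side, the Buffers lines can be compared numerically. The following is only a minimal sketch: the helper names `shared_buffers` and `pages_touched` are made up for illustration, and the two input strings are copied from the top node of each plan in the session.

```python
import re

# Buffers lines copied from the top node of the two plans in this article.
BITMAP_PLAN = "Buffers: shared hit=930 read=3"
INDEX_ONLY_PLAN = "Buffers: shared hit=4 read=3"


def shared_buffers(line: str) -> dict:
    """Return the hit/read page counts found in one 'Buffers: shared ...' line.

    Simplification: only 'shared' counters are expected in the input;
    'local' and 'temp' counters are not handled by this sketch.
    """
    counts = {"hit": 0, "read": 0}
    for key, value in re.findall(r"\b(hit|read)=(\d+)", line):
        counts[key] = int(value)
    return counts


def pages_touched(line: str) -> int:
    """Total pages the plan accessed: cache hits plus pages read in."""
    counts = shared_buffers(line)
    return counts["hit"] + counts["read"]


bitmap_pages = pages_touched(BITMAP_PLAN)
index_only_pages = pages_touched(INDEX_ONLY_PLAN)
print(f"bitmap plan: {bitmap_pages} pages, index-only plan: {index_only_pages} pages")
```

Counting hits and reads together gives the total number of pages each execution had to access, whether they came from shared buffers or had to be read in.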

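Comparing the two plans shows the key problem: the bitmap plan touches far more pages in memory. Every session running it incurs 930 buffer hits, while the Index Only Scan needs only 4.

That is a large gap. For a single query it does not matter much, especially on today's large-memory database servers, but under very high concurrency things are different. With 100 concurrent sessions the difference becomes substantial:

93,000 buffer hits versus 400, and the gap in memory touched keeps growing with concurrency: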

93,000 pages × 8 KB ≈ 744,000 KB ≈ 726 MB

400 pages × 8 KB ≈ 3,200 KB ≈ 3.1 MB
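The arithmetic above can be replayed for other concurrency levels. This is a small sketch, assuming the default 8 KB PostgreSQL page size and the per-query buffer hits from the two plans (930 and 4); the function name `touched_mb` is made up for illustration.

```python
PAGE_KB = 8  # assumed default PostgreSQL block size in KB


def touched_mb(hits_per_query: int, sessions: int) -> float:
    """Total size in MB of the buffer pages accessed by `sessions` concurrent runs."""
    pages = hits_per_query * sessions
    return pages * PAGE_KB / 1024


bitmap_mb = touched_mb(930, 100)     # Bitmap Heap Scan plan
index_only_mb = touched_mb(4, 100)   # Index Only Scan plan
print(f"bitmap: {bitmap_mb:.2f} MB, index-only: {index_only_mb:.2f} MB")
```

Note that this counts page accesses, so sessions hitting the same pages in shared buffers are counted once per access; it measures the volume of memory touched rather than additional memory allocated.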

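It also burns more CPU, which I won't go into here. So when optimizing PostgreSQL, check whether a query is taking a bitmap scan because only separate single-column indexes exist, when a composite index would let it take an index scan (here, an Index Only Scan) instead.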

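There is another issue here: neither PostgreSQL nor other database products take the concurrency of a SQL statement into account. They estimate COST for a single execution only. A bitmap scan issues less I/O, and less I/O is exactly what a database wants to see when running SQL. The simplified form of the PG cost calculation below shows the same thing.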

Total Cost =
  Seq/Index Page Cost
+ CPU tuple cost
+ CPU operator cost
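As a worked example of this formula, the sketch below recomputes the cost of the Parallel Seq Scan node from the first plan. It uses PostgreSQL's default cost constants (seq_page_cost = 1.0, cpu_tuple_cost = 0.01, cpu_operator_cost = 0.0025). Two things are assumptions rather than facts from the article: the table size of 9346 pages is taken from that node's Buffers line, and the divisor applied to the CPU part for 2 workers (2.4) follows my understanding of how the planner accounts for a participating leader.

```python
# PostgreSQL default planner cost constants.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def parallel_divisor(workers: int) -> float:
    """Assumed divisor for CPU cost: each worker counts as 1,
    and the leader adds a fraction that shrinks as workers are added."""
    leader = max(0.0, 1.0 - 0.3 * workers)
    return workers + leader


def parallel_seq_scan_cost(pages: int, tuples: int, quals: int, workers: int) -> float:
    """Page cost plus per-tuple CPU cost, with the CPU part split across processes."""
    disk = pages * SEQ_PAGE_COST
    cpu = tuples * (CPU_TUPLE_COST + quals * CPU_OPERATOR_COST)
    return disk + cpu / parallel_divisor(workers)


# orders: 1,000,000 rows, 9346 pages (assumed), two filter conditions, 2 workers.
cost = parallel_seq_scan_cost(pages=9346, tuples=1_000_000, quals=2, workers=2)
print(round(cost, 2))
```

Under these assumptions the result agrees with the cost=0.00..15596.00 shown for the Parallel Seq Scan node. Nothing in the calculation depends on how many sessions run the query at the same time.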

Under high concurrency, less CPU work and less memory are indeed an advantage. But what if the database server has plenty of CPU and memory to spare?
