我有两个问题。一个有"=“,另一个有">”作为过滤条件。我的问题是,为什么规划者计划只在条件为等于的情况下并行运行查询。
如果我处理parallel_tuple_cost和parallel_setup_cost的默认值,我可以使两者并行运行。但是我想知道为什么默认配置是这样计划的。
我正在提供我执行的查询和规划者解释输出。在fiter中使用的字段中没有索引。

这与两名工作人员并行进行:
VACUUM; EXPLAIN ANALYSE select screen_name from accounts where followers_count = 100;
(cost=1000.00..81857.06 rows=401 width=12) (actual time=0.442..149.981 rows=5028 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on accounts (cost=0.00..80816.96 rows=167 width=12) (actual time=0.100..113.974 rows=1676 loops=3)
Filter: (followers_count = 100)
Rows Removed by Filter: 1154464
Planning Time: 0.040 ms
Execution Time: 150.294 ms这与单个员工一起运行:
VACUUM; EXPLAIN ANALYSE select screen_name from accounts where followers_count > 100;
Seq Scan on accounts (cost=0.00..106105.64 rows=2278992 width=12) (actual time=2.608..562.870 rows=2286022 loops=1)
Filter: (followers_count > 100)
Rows Removed by Filter: 1182399
Planning Time: 0.059 ms
JIT:
Functions: 4
Options: Inlining false, Optimization false, Expressions true, Deforming true"
Timing: Generation 0.369 ms, Inlining 0.000 ms, Optimization 0.186 ms, Emission 2.316 ms, Total 2.871 ms"
Execution Time: 640.952 ms查询在筛选条件中只有不同的符号。
使用不太保守的设置:
SET parallel_tuple_cost to 0;
SET parallel_setup_cost to 0;
SET max_parallel_workers_per_gather to 4;
VACUUM; EXPLAIN ANALYSE select * from accounts where followers_count > 100;
Gather (cost=0.00..73589.74 rows=2278693 width=121) (actual time=0.220..391.700 rows=2286022 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Parallel Seq Scan on accounts (cost=0.00..73589.74 rows=569673 width=121) (actual time=0.024..117.873 rows=457204 loops=5)
Filter: (followers_count > 100)
Rows Removed by Filter: 236480
Planning Time: 0.065 ms
Execution Time: 484.223 ms使用默认设置:
SET parallel_tuple_cost to 0.1;
SET parallel_setup_cost to 1000;
SET max_parallel_workers_per_gather to 2;
VACUUM; EXPLAIN ANALYSE select * from accounts where followers_count > 100;
Seq Scan on accounts (cost=0.00..106100.83 rows=2278739 width=121) (actual time=2.610..552.259 rows=2286022 loops=1)
Filter: (followers_count > 100)
Rows Removed by Filter: 1182399
Planning Time: 0.083 ms
JIT:
Functions: 2
" Options: Inlining false, Optimization false, Expressions true, Deforming true"
" Timing: Generation 0.462 ms, Inlining 0.000 ms, Optimization 0.175 ms, Emission 2.239 ms, Total 2.876 ms"
Execution Time: 634.788 ms发布于 2020-10-07 21:20:25
您的>查询返回的行比=多得多。从并行工作者返回的每一行都需要从并行工作者传递给领导者,这样做是不免费的。(这是parallel_tuple_cost核算的成本)。另一方面,由并行工作人员过滤掉的行不受此费用的影响,因为它不会被放弃。如果大多数行将被过滤掉,则并行进行筛选是有意义的。如果有少数行将被过滤掉,那么并行地进行筛选可能没有意义,但随后在进程之间来回移动数据的额外工作就没有意义了。
您没有向我们展示的是,并行执行>查询是否比在单个进程中执行它要快。也许你需要把parallel_tuple_cost调小一些。默认设置相当保守。
https://dba.stackexchange.com/questions/276709
复制相似问题