我有一个postgres 13.3数据库运行在一台CentOS机器上。在执行这些测试时,这台机器上没有其他任何东西正在运行,在执行测试时,也没有其他任何东西访问数据库。
表卡默代尔大约有500.000行。我已经重复了这个实验,在其他包含大约50.000行的表中。结果是一样的,但减速似乎与所使用的fdw表中的行数以及生成的假id的数量有关。
运行此命令:
CREATE OR REPLACE FUNCTION jegfatterintet() RETURNS TABLE (c BIGINT)
AS $$
DECLARE
jammerdal_ids VARCHAR[];
area_row RECORD;
start TIMESTAMP;
BEGIN
SELECT INTO jammerdal_ids ARRAY_AGG('id-'||x::VARCHAR) FROM generate_series(0,25) x;
FOR area_row IN SELECT * FROM generate_series(1,10)
LOOP
SELECT INTO start clock_timestamp();
--RAISE NOTICE '%: Start %.', clock_timestamp(), area_row;
RETURN QUERY
SELECT COUNT(*) FROM jammerdal WHERE id = ANY(jammerdal_ids);
--RAISE NOTICE '%: End %.', clock_timestamp(), area_row;
RAISE NOTICE '%: Duration is %.', area_row, clock_timestamp()-start;
END LOOP;
RAISE NOTICE '%: All done.', clock_timestamp();
END
$$ LANGUAGE plpgsql;
SELECT * FROM jegfatterintet();产生这个输出:
CREATE FUNCTION
NOTICE: (1): Duration is 00:00:00.019555.
NOTICE: (2): Duration is 00:00:00.001271.
NOTICE: (3): Duration is 00:00:00.001089.
NOTICE: (4): Duration is 00:00:00.00118.
NOTICE: (5): Duration is 00:00:00.001035.
NOTICE: (6): Duration is 00:00:02.954527.
NOTICE: (7): Duration is 00:00:02.871185.
NOTICE: (8): Duration is 00:00:02.812426.
NOTICE: (9): Duration is 00:00:02.777037.
NOTICE: (10): Duration is 00:00:02.90708.
NOTICE: 2021-09-07 11:21:53.577115+00: All done.
c
---
0
0
0
0
0
0
0
0
0
0
(10 rows)注意,从步骤6向前看,持续时间是如何从0.01秒以下突然上升到几乎3s的。
这只发生在jammerdal是一个外部(fdw)表,而不是当它是本地的。只有在使用ids数组时才会发生这种情况。
如果我将函数更改为:
CREATE OR REPLACE FUNCTION jegfatterintet() RETURNS TABLE (c BIGINT)
AS $$
DECLARE
jammerdal_ids VARCHAR[];
area_row RECORD;
start TIMESTAMP;
BEGIN
SELECT INTO jammerdal_ids ARRAY_AGG('id-'||x::VARCHAR) FROM generate_series(0,25) x;
FOR area_row IN SELECT * FROM generate_series(1,10)
LOOP
SELECT INTO start clock_timestamp();
--RAISE NOTICE '%: Start %.', clock_timestamp(), area_row;
RETURN QUERY
SELECT COUNT(*) FROM jammerdal WHERE id IN ('id-0', 'id-1', 'id-2', 'id-3', 'id-4', 'id-5', 'id-6', 'id-7', 'id-8', 'id-9', 'id-10', 'id-11', 'id-12', 'id-13', 'id-14', 'id-15', 'id-16', 'id-17', 'id-18', 'id-19', 'id-20', 'id-21', 'id-22', 'id-23', 'id-24', 'id-25'); --id = ANY(jammerdal_ids);
--RAISE NOTICE '%: End %.', clock_timestamp(), area_row;
RAISE NOTICE '%: Duration is %.', area_row, clock_timestamp()-start;
END LOOP;
RAISE NOTICE '%: All done.', clock_timestamp();
END
$$ LANGUAGE plpgsql;
SELECT * FROM jegfatterintet();产出变成:
CREATE FUNCTION
NOTICE: (1): Duration is 00:00:00.028254.
NOTICE: (2): Duration is 00:00:00.001768.
NOTICE: (3): Duration is 00:00:00.001512.
NOTICE: (4): Duration is 00:00:00.001426.
NOTICE: (5): Duration is 00:00:00.001523.
NOTICE: (6): Duration is 00:00:00.001389.
NOTICE: (7): Duration is 00:00:00.001363.
NOTICE: (8): Duration is 00:00:00.001364.
NOTICE: (9): Duration is 00:00:00.001466.
NOTICE: (10): Duration is 00:00:00.001454.
NOTICE: 2021-09-07 11:25:46.635762+00: All done.
c
---
0
0
0
0
0
0
0
0
0
0
(10 rows)有人能给我解释一下吗?
编辑:
explain (ANALYZE, BUFFERS) select id from jammerdal where id = any('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Foreign Scan on jammerdal (cost=100.00..62578.95 rows=26 width=37) (actual time=0.718..0.719 rows=0 loops=1)
Planning Time: 0.153 ms
Execution Time: 1.101 ms
(3 rows)还有一个在“外文”数据库中运行的解释:
explain (ANALYZE, BUFFERS) select id from jammerdal where id = any('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using jammerdal_pkey on jammerdal (cost=0.42..219.41 rows=26 width=37) (actual time=0.286..0.290 rows=0 loops=1)
Index Cond: (id = ANY ('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}'::text[]))
Heap Fetches: 0
Buffers: shared hit=81
Planning:
Buffers: shared hit=205
Planning Time: 2.111 ms
Execution Time: 0.423 ms
(8 rows)顺便说一句
ANALYZE jammerdal;没有效果。
编辑2:问题很明显是fdw表没有对id使用索引.
ALTER SERVER testdb OPTIONS (ADD use_remote_estimate 'true');成功了!
发布于 2021-09-07 12:12:51
由于查询位于函数中,所以PostgreSQL缓存执行计划。这是根据一种特殊的启发式方法进行的:
对于前五次执行,(jammerdal_ids)生成一个使用实际参数值
。
如果是,则从第六次执行开始使用通用计划来节省计划时间(
)。
在你的情况下,一般的计划显然是不好的。
由于您没有显示EXPLAIN (ANALYZE, BUFFERS)输出,我们只能猜测原因。但是一个很好的猜测是,您忘记了ANALYZE外部表,并且有糟糕的统计数据。所以用它
ANALYZE jammerdal;你应该注意到一个进步。
(请注意,PostgreSQL不会自动收集外部表的统计信息。)
https://stackoverflow.com/questions/69087568
复制相似问题