我在缓慢的delete查询中遇到了问题。我有一个模式,称为"target“,其中包含的表在另一个模式中都有一个等价表(相同的列和主键),称为"delta”。现在,我想从目标模式中删除增量模式中出现的所有行。我已经尝试过使用DELETE FROM WHERE EXISTS方法,但这似乎非常慢。下面是一个查询示例:
DELETE FROM "target".name2phoneme
WHERE EXISTS(
SELECT 1 FROM delta.name2phoneme d
WHERE name2phoneme.NAME_ID = d.NAME_ID
AND name2phoneme.PHONEME_ID = d.PHONEME_ID
);这是两个表的布局(除了“增量”模式只有主键而没有外键)
CREATE TABLE name2phoneme
(
name_id uuid NOT NULL,
phoneme_id uuid NOT NULL,
seq_num numeric(3,0),
CONSTRAINT pk_name2phoneme PRIMARY KEY (name_id, phoneme_id),
CONSTRAINT fk_name2phoneme_name_id_2_name FOREIGN KEY (name_id)
REFERENCES name (name_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
DEFERRABLE INITIALLY DEFERRED,
CONSTRAINT fk_name2phoneme_phoneme_id_2_phoneme FOREIGN KEY (phoneme_id)
REFERENCES phoneme (phoneme_id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE NO ACTION
DEFERRABLE INITIALLY DEFERRED
)“目标”表最初包含18M多一点的行,而增量表包含大约370M行(将从目标中删除)。
下面是上面查询的EXPLAIN的输出:
"Delete on name2phoneme (cost=154858.03..1068580.46 rows=6449114 width=12)"
" -> Hash Join (cost=154858.03..1068580.46 rows=6449114 width=12)"
" Hash Cond: ((name2phoneme.name_id = d.name_id) AND (name2phoneme.phoneme_id = d.phoneme_id))"
" -> Seq Scan on name2phoneme (cost=0.00..331148.16 rows=18062616 width=38)"
" -> Hash (cost=69000.01..69000.01 rows=3763601 width=38)"
" -> Seq Scan on name2phoneme d (cost=0.00..69000.01 rows=3763601 width=38)"我试图解释分析上面的查询,但执行花费了2个小时,所以我终止了它。
对如何优化这个操作有什么想法吗?
发布于 2017-11-21 08:08:29
删除370万行非常耗时,因为查找每一行,然后记录和删除这些行的开销很大。仅仅考虑所有脏页、日志记录和缓存未命中都令人难以置信--更不用说对索引的更新了。
因此,类似这样的操作可能会快得多:
create temporary table temp_n2p as
select n2p.*
from "target".name2phoneme n2p
where not exists (select 1
from delta.name2phoneme d
where n2p.NAME_ID = d.NAME_ID and
n2p.PHONEME_ID = d.PHONEME_ID
);
truncate table "target".name2phoneme;
insert into "target".name2phoneme
select *
from temp_n2p;还应该在截断之前删除索引,然后在截断后重新创建它们。
发布于 2017-11-21 08:06:23
您是否尝试过这两种方法之一:
DELETE
FROM "target".name2phoneme t
USING delta.name2phoneme d
WHERE t.NAME_ID = d.NAME_ID
AND t.PHONEME_ID = d.PHONEME_ID
;或者使用WITH,但是Postgres确实实现了CTE,所以我不确定这在您的需求范围内是明智的。
WITH cte AS (
SELECT t.name_id, t.phoneme_id
FROM "target".name2phoneme t
INNER JOIN delta.name2phoneme d ON t.NAME_ID = d.NAME_ID
AND t.PHONEME_ID = d.PHONEME_ID
)
DELETE FROM "target".name2phoneme t
USING cte d
WHERE t.NAME_ID = d.NAME_ID
AND t.PHONEME_ID = d.PHONEME_ID
;https://stackoverflow.com/questions/47402098
复制相似问题