我的postgresql数据库有一个表electrical_measurement,其中包含大约50亿条记录。我对每一列都有索引。我正在尝试执行以下查询,但它永远不会结束。想知道如何修改它才能运行得更快。
SELECT
em.id AS em_id,
em.test_board_id_in,
em.die,
tvt_net.name,
mb_pad_map.x,
mb_pad_map.y,
em.temperature,
em.timestamp,
em.avg_meas_voltage
FROM electrical_measurement AS em
INNER JOIN main_board_pad_map AS mb_pad_map
ON em.net_id_in = mb_pad_map.net_id
INNER JOIN tvt_net
ON em.net_id_in = tvt_net.id
WHERE em.assembly_id = 1
AND em.net_id_in IN
(SELECT em.net_id_in
FROM electrical_measurement AS em
WHERE em.assembly_id = 1
AND em.avg_meas_voltage > 0
GROUP BY em.net_id_in)
ORDER BY em.timestamp这是来自EXPLAIN的结果
-------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=373158311.30..573643901.29 rows=1718327938 width=63)
Workers Planned: 2
-> Sort (cost=373157311.28..375305221.20 rows=859163969 width=63)
Sort Key: em."timestamp"
-> Hash Join (cost=84935808.04..171830022.94 rows=859163969 width=63)
Hash Cond: (em.net_id_in = mb_pad_map.net_id)
-> Hash Join (cost=84935424.26..161155613.60 rows=118993479 width=41)
Hash Cond: (em.net_id_in = em_1.net_id_in)
-> Parallel Bitmap Heap Scan on electrical_measurement em (cost=2996320.29..78903135.78 rows=118993479 width=37)
Recheck Cond: (assembly_id = 1)
-> Bitmap Index Scan on electrical_measurement_assembly_id_idx (cost=0.00..2924924.21 rows=285584350 width=0)
Index Cond: (assembly_id = 1)
-> Hash (cost=81939087.68..81939087.68 rows=1303 width=4)
-> HashAggregate (cost=81939061.62..81939074.65 rows=1303 width=4)
Group Key: em_1.net_id_in
-> Bitmap Heap Scan on electrical_measurement em_1 (cost=2953194.68..81656356.93 rows=113081878 width=4)
Recheck Cond: (assembly_id = 1)
Filter: (avg_meas_voltage > '0'::numeric)
-> Bitmap Index Scan on electrical_measurement_assembly_id_idx (cost=0.00..2924924.21 rows=285584350 width=0)
Index Cond: (assembly_id = 1)
-> Hash (cost=266.17..266.17 rows=9408 width=38)
-> Hash Join (cost=42.32..266.17 rows=9408 width=38)
Hash Cond: (mb_pad_map.net_id = tvt_net.id)
-> Seq Scan on main_board_pad_map mb_pad_map (cost=0.00..199.08 rows=9408 width=16)
-> Hash (cost=26.03..26.03 rows=1303 width=22)
-> Seq Scan on tvt_net (cost=0.00..26.03 rows=1303 width=22)
(26 rows)你有什么意见建议?谢谢
发布于 2019-05-29 05:47:19
您可以尝试使用窗口函数:
SELECT . . .
FROM (SELECT em.*,
COUNT(*) FILTER (WHERE em.assembly_id = 1 AND em.avg_meas_voltage) OVER (PARTITION BY em.net_id_in) as cnt
FROM electrical_measurement em
) em JOIN
main_board_pad_map mbpm
ON em.net_id_in = mbpm.net_id JOIN
tvt_net
ON em.net_id_in = tv.id
WHERE em.assembly_id = 1 AND
cnt > 0
ORDER BY em.timestamp发布于 2019-05-30 04:51:30
子查询是杀手级的。需要移除它,并以另一种方式完成。感谢您的评论。
https://stackoverflow.com/questions/56349468
复制相似问题