I'm running a PostgreSQL 11 database in AWS on a db.t2.xlarge instance (4 vCPUs, 16 GB RAM) with 4 TB of storage. My query works on a fairly large table (48 GB, 243,711,955 rows), but nothing PostgreSQL shouldn't be able to handle.
I'm running a simple query (I've added the query plan in case it offers any insight), but after about 10 hours the free storage drops sharply to zero and the query fails:
ERROR: could not extend file "base/16401/321099.3525": No space left on device
HINT: Check free disk space.

I tried to solve this with extra storage (I added 1 TB to the database and ran a VACUUM on all tables; I know autovacuum is configured on RDS instances), but the problem persists. I'm worried that adding more space may not be the real fix, since the database itself needs less than 1 TB:
name | owner | size
----------------------+-------------+-----------
rdsadmin | rdsadmin | No Access
                     |             | 317 GB

I don't fully understand when PostgreSQL creates temporary files (are these hashes, or temporary ORDER BY sorts?), but my database has a huge amount of them:
datname | Temporary files | Size of temporary files
----------------------+-----------------+-------------------------
rdsadmin | 0 | 0
template0 | 0 | 0
postgres | 0 | 0
| 1079 | 991097557473
template1            |               0 |                       0

Should I just keep adding storage until the query runs (RDS has "emergency storage" for cases like this)? Or is there some configuration I'm missing that should be set before running the query?
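For reference, the per-database temp-file counters in the table above come from the `pg_stat_database` view, and PostgreSQL can be told to abort a query once its temp files exceed a cap instead of filling the disk. A minimal sketch (the values are illustrative, not recommendations; changing these settings may require elevated privileges, and on RDS `temp_file_limit` is set through a DB parameter group rather than `SET`):

```sql
-- Temp-file usage per database (this is where the numbers above come from):
SELECT datname, temp_files, temp_bytes
FROM pg_stat_database;

-- Cap temp-file usage per session so a runaway query errors out early
-- instead of exhausting storage (-1, the default, means unlimited):
SET temp_file_limit = '100GB';   -- illustrative value

-- More memory per sort/hash node means less spilling to temp files,
-- but each concurrent sort may use up to this much:
SET work_mem = '256MB';          -- illustrative value
```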
Edit: adding the query.
Query:
with times as (
select name_source,
id_source,
run,
init_dt,
hour_along,
ntile(3) over (partition by run, date_part('year', init_dt), id_source order by hour_along) as t,
height,
traj_dt,
geom
from hysplit_process.clean_trajectories
),
dates_for_t as (
select distinct on (
id_source,
name_source,
run,
date_part('year', init_dt),
t) name_source,
id_source,
run,
t,
traj_dt
from times
order by id_source, name_source, run, date_part('year', init_dt), run, t, traj_dt desc
),
agg_trajs_by_t as (
select name_source,
id_source,
run,
t,
init_dt,
avg(st_x(geom)) as avg_lon,
avg(st_y(geom)) as avg_lat,
avg(height) as avg_height
from times
group by name_source, id_source, run, init_dt, t
)
select t.name_source,
t.id_source,
t.run,
t.t,
t.init_dt,
d.traj_dt,
st_setsrid(st_makepoint(t.avg_lon, t.avg_lat), 4326) as geom,
t.avg_height
from agg_trajs_by_t as t
left join dates_for_t as d
using (run, t);

Query plan:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Right Join (cost=274164346.44..274773926.31 rows=24371195 width=104)
Merge Cond: ((d.run = t.run) AND (d.t = t.t))
CTE times
-> WindowAgg (cost=90827593.60..97529672.28 rows=243711952 width=100)
-> Sort (cost=90827593.60..91436873.48 rows=243711952 width=96)
Sort Key: clean_trajectories.run, (date_part('year'::text, clean_trajectories.init_dt)), clean_trajectories.id_source, clean_trajectories.hour_along
-> Seq Scan on clean_trajectories (cost=0.00..6897785.40 rows=243711952 width=96)
CTE dates_for_t
-> Unique (cost=72144047.24..73971886.88 rows=40000 width=56)
-> Sort (cost=72144047.24..72753327.12 rows=243711952 width=56)
Sort Key: times.run, times.t, times.traj_dt DESC
-> CTE Scan on times (cost=0.00..4874239.04 rows=243711952 width=56)
CTE agg_trajs_by_t
-> GroupAggregate (cost=88804047.24..95932621.83 rows=24371195 width=80)
Group Key: times_1.name_source, times_1.id_source, times_1.run, times_1.init_dt, times_1.t
-> Sort (cost=88804047.24..89413327.12 rows=243711952 width=96)
Sort Key: times_1.name_source, times_1.id_source, times_1.run, times_1.init_dt, times_1.t
-> CTE Scan on times times_1 (cost=0.00..4874239.04 rows=243711952 width=96)
-> Sort (cost=3857.54..3957.54 rows=40000 width=16)
Sort Key: d.run, d.t
-> CTE Scan on dates_for_t d (cost=0.00..800.00 rows=40000 width=16)
-> Materialize (cost=6726307.90..6848163.88 rows=24371195 width=80)
-> Sort (cost=6726307.90..6787235.89 rows=24371195 width=80)
Sort Key: t.run, t.t
                           ->  CTE Scan on agg_trajs_by_t t  (cost=0.00..487423.90 rows=24371195 width=80)

Posted on 2020-02-04 16:19:33
Thanks everyone for the suggestions.
Following the comments above, I broke the CTE down into several smaller steps. This not only let me make use of more indexes, but also kept my RDS instance from falling over. I hope this helps someone in the future if they get bogged down by PostgreSQL 11.
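The "smaller steps" can be sketched roughly like this (an assumed reconstruction, not the exact queries used: the idea is to materialize the windowed `times` CTE once into a temp table, index it, and then run the downstream aggregations against that table instead of re-sorting the 243M-row CTE for each consumer):

```sql
-- Materialize the windowed rows once (one big sort instead of three):
CREATE TEMP TABLE times AS
SELECT name_source, id_source, run, init_dt, hour_along,
       ntile(3) OVER (PARTITION BY run, date_part('year', init_dt), id_source
                      ORDER BY hour_along) AS t,
       height, traj_dt, geom
FROM hysplit_process.clean_trajectories;

-- Index the join/grouping keys so later steps can avoid full sorts:
CREATE INDEX ON times (run, t, traj_dt DESC);
ANALYZE times;

-- dates_for_t and agg_trajs_by_t then become ordinary queries (or further
-- temp tables) against this indexed table, and each step can be run,
-- checked, and dropped independently.
```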
https://dba.stackexchange.com/questions/258311