假设我有一个名为measurement的表。此表的目的是测量在特定“日期”的"series_id“的数值”值“(其本身由其他数据计算得出)。
现在,让我们使用"effective_start“(包含)和"effective_end”(包含)字段将生效日期添加到此表中。
DDL:
CREATE TABLE public.measurement
(
date date NOT NULL,
effective_end date NOT NULL,
effective_start date NOT NULL,
series_id character varying(255) COLLATE pg_catalog."default" NOT NULL,
value numeric,
CONSTRAINT measurement_pkey PRIMARY KEY (date, effective_end, effective_start, series_id)
)我的挑战是现在快速地,并且只使用SQL (我有Java代码和一个部分查询来解决这个问题),构造一个结果如下的查询:
对于所有系列,在特定的时间日期(查询参数),返回在查询的特定日期有效的最新(最大“日期”)度量。
我目前的"all-SQL“解决方案是一个视图,并结合了对该视图的查询:
视图的DDL:
CREATE OR REPLACE VIEW public.known_at AS
SELECT o.date,
o.effective_end,
o.effective_start,
o.series_id,
o.value
FROM measurement o
JOIN ( SELECT o_1.series_id,
min(o_1.effective_start) AS effective_start,
o_1.date
FROM measurement o_1
GROUP BY o_1.series_id, o_1.date) x ON o.series_id::text = x.series_id::text AND o.effective_start = x.effective_start AND o.date = x.date
JOIN ( SELECT o_1.series_id,
o_1.effective_start,
max(o_1.date) AS date
FROM measurement o_1
GROUP BY o_1.series_id, o_1.effective_start) y ON x.series_id::text = y.series_id::text AND x.effective_start = y.effective_start AND x.date = y.date
WHERE o.date <= o.effective_start
ORDER BY o.date DESC, o.series_id DESC;查询:
select k.* from known_at k
inner join (
select
k.series_id,
max(k.date) as date
from known_at k
-- the passed in date here is a parameter as described above
where k.date <= '2020-03-26'
group by k.series_id) as mx
on k.series_id = mx.series_id and k.date = mx.date
order by k.series_id;不幸的是,尽管series_id、date、effective_end和effective_start上有btree索引,但视图和查询的组合速度很慢(~400ms)。我怎样才能做得更好?
发布于 2021-03-27 13:22:03
我认为这个查询应该会给你想要的结果,尽管没有你的数据集,很难说它的性能会是什么样子。对于这个查询,我建议在(effective_start,effective_end,series_id,date DESC)上使用多列索引。
SELECT DISTINCT ON (series_id) *
FROM measurement
WHERE effective_start <= '2020-03-26' -- the passed-in date
AND effective_end >= '2020-03-26' -- the passed-in date
ORDER BY series_id, date DESC;说明:查询会筛选出在有效期内包含传入日期的行,然后对于筛选出的行中的每个series_id,取日期最大的行。
此外,您可能需要考虑对生效日期使用日期范围类型。范围类型附带了一些有用的范围运算符。
https://stackoverflow.com/questions/66828144
复制相似问题