首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >组的最佳索引:+ Join + Where条件

组的最佳索引:+ Join + Where条件
EN

Database Administration用户
提问于 2019-10-11 20:23:45
回答 1查看 96关注 0票数 1

我有一个数据库,其中包括以下表格:

代码语言:javascript
复制
CREATE TABLE metadata
(
    keyword_id bigint NOT NULL,
    campaign_id bigint NOT NULL,
    org_id bigint NOT NULL,
    keyword character varying(255) COLLATE pg_catalog."default",
    bid_currency character varying(5) COLLATE pg_catalog."default",
    bid_amount double precision,
    status character varying(16) COLLATE pg_catalog."default",
    ad_group_id bigint,
    keyword_type character varying(16) COLLATE pg_catalog."default" NOT NULL,
    country_or_region character varying(3) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT idx_primary PRIMARY KEY (org_id, keyword_type, country_or_region, campaign_id, keyword_id)
);

CREATE TABLE report
(
    org_id bigint NOT NULL,
    campaign_id bigint NOT NULL,
    ad_group_id bigint NOT NULL,
    keyword_id bigint NOT NULL,
    keyword character varying(255) COLLATE pg_catalog."default",
    impressions bigint,
    taps bigint,
    installs bigint,
    local_spend_currency character varying(5) COLLATE pg_catalog."default",
    report_date date NOT NULL,
    country_or_region character varying(5) COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT unx_org_cty_camp_key_date_di UNIQUE (org_id, country_or_region, campaign_id, keyword_id, report_date)

);

我有一个查询,如下所示:

代码语言:javascript
复制
Select 
r.report_date as reportDate, 
max(r.local_spend_currency) as currency, 
sum(r.local_spend_amount) as spend, 
sum(r.impressions) as impressions, 
sum(r.taps) as taps, 
...
from report r 
left join  metadata m  
    on m.org_id = r.org_id 
    and m.country_or_region = r.country_or_region 
    and m.campaign_id = r.campaign_id  
    and m.keyword_id = r.keyword_id  
 where m.org_id = 1 
     and m.keyword_type = 'KEYWORD'
           and (r.report_date between '2019-09-01' and '2019-10-10')
    group by r.report_date 
 order  by r.report_date
    offset 0  limit 180;

对于这个查询,我想要创建索引。我刚刚创建了一些索引,但它们不起作用。

现在,我定义的索引是:

元数据:

代码语言:javascript
复制
PRIMARY KEY (org_id, keyword_type, country_or_region, campaign_id, keyword_id);

报告:

代码语言:javascript
复制
unx_org_cty_camp_key_date_di (org_id, country_or_region, campaign_id, keyword_id, report_date);
idx_kr_org_date(org_id, report_date);

解释结果如下:

代码语言:javascript
复制
"Limit  (cost=2147246.64..2147248.79 rows=33 width=294)"
"  ->  GroupAggregate  (cost=2147246.64..2147248.79 rows=33 width=294)"
"        Group Key: m.org_id, m.country_or_region, m.campaign_id, m.keyword_id"
"        ->  Sort  (cost=2147246.64..2147246.72 rows=33 width=98)"
"              Sort Key: m.country_or_region, m.campaign_id, m.keyword_id"
"              ->  Gather  (cost=489239.37..2147245.81 rows=33 width=98)"
"                    Workers Planned: 2"
"                    ->  Parallel Hash Join  (cost=488239.37..2146242.51 rows=14 width=98)"
"                          Hash Cond: (((r.country_or_region)::text = (m.country_or_region)::text) AND (r.campaign_id = m.campaign_id) AND (r.keyword_id = m.keyword_id))"
"                          ->  Parallel Bitmap Heap Scan on report r  (cost=22465.50..1661304.60 rows=332449 width=99)"
"                                Recheck Cond: ((org_id = 479360) AND (report_date >= '2019-09-01'::date) AND (report_date <= '2019-10-10'::date))"
"                                ->  Bitmap Index Scan on idx_kr_org_date  (cost=0.00..22266.03 rows=797877 width=0)"
"                                      Index Cond: ((org_id = 479360) AND (report_date >= '2019-09-01'::date) AND (report_date <= '2019-10-10'::date))"
"                          ->  Parallel Hash  (cost=443861.05..443861.05 rows=900390 width=26)"
"                                ->  Parallel Seq Scan on metadata m  (cost=0.00..443861.05 rows=900390 width=26)"
"                                      Filter: ((org_id = 479360) AND ((keyword_type)::text = 'KEYWORD'::text))"

解释(分析):

代码语言:javascript
复制
"Limit  (cost=2397014.94..2397019.73 rows=87 width=268) (actual time=121193.808..122001.572 rows=97 loops=1)"
"  ->  GroupAggregate  (cost=2397014.94..2397019.73 rows=87 width=268) (actual time=121193.807..122001.553 rows=97 loops=1)"
"        Group Key: r.report_date"
"        ->  Sort  (cost=2397014.94..2397015.16 rows=87 width=72) (actual time=121189.172..121451.846 rows=2295035 loops=1)"
"              Sort Key: r.report_date"
"              Sort Method: external merge  Disk: 184168kB"
"              ->  Gather  (cost=466773.88..2397012.14 rows=87 width=72) (actual time=118071.231..120119.593 rows=2295035 loops=1)"
"                    Workers Planned: 2"
"                    Workers Launched: 2"
"                    ->  Parallel Hash Join  (cost=465773.88..2396003.44 rows=36 width=72) (actual time=118085.595..119674.465 rows=765012 loops=3)"
"                          Hash Cond: (((r.country_or_region)::text = (m.country_or_region)::text) AND (r.campaign_id = m.campaign_id) AND (r.keyword_id = m.keyword_id))"
"                          ->  Parallel Seq Scan on report r  (cost=0.00..1889731.54 rows=877718 width=99) (actual time=5236.366..112636.044 rows=765012 loops=3)"
"                                Filter: ((report_date >= '2019-07-01'::date) AND (report_date <= '2019-10-10'::date) AND (org_id = 479360))"
"                                Rows Removed by Filter: 14384864"
"                          ->  Parallel Hash  (cost=443861.05..443861.05 rows=900390 width=26) (actual time=4760.854..4760.854 rows=727133 loops=3)"
"                                Buckets: 65536  Batches: 64  Memory Usage: 2688kB"
"                                ->  Parallel Seq Scan on metadata m  (cost=0.00..443861.05 rows=900390 width=26) (actual time=0.009..4326.621 rows=727133 loops=3)"
"                                      Filter: ((org_id = 479360) AND ((keyword_type)::text = 'KEYWORD'::text))"
"                                      Rows Removed by Filter: 2826534"
"Planning Time: 0.436 ms"
"Execution Time: 122084.375 ms"

因此,我希望为元数据和报表创建索引,以涵盖所有条件。我怎样才能创造它们?

EN

回答 1

Database Administration用户

发布于 2019-10-12 07:31:25

删除的行非常高,因此它将是在report_date上进行索引的第一个指示。但是,如果您的数据对于report_date具有非常大的基数,那么索引选择可能会有所不同。

您试过最后通过desc空添加report_date订单的索引吗?

其中一个合并是使用磁盘,这是昂贵的。因此,在report_date上添加索引会有所帮助。

还考虑为org_id设置另一个列索引。

票数 0
EN
页面原文内容由Database Administration提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://dba.stackexchange.com/questions/250883

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档