首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >从countMerge materialized_view中选择

从countMerge materialized_view中选择
EN

Stack Overflow用户
提问于 2020-10-21 10:43:59
回答 1查看 387关注 0票数 0

最近我开始使用clickhouse,我遇到了一些麻烦。我使用带有3个碎片的集群,每个碎片都有一个额外的复制,因此总共有6个服务器。

我在包含完整数据的本地表上创建本地MV (称为tbl),并基于本地MV创建分布式MV。包含完整数据的本地表,或tbl使用ReplicatedMergeTree作为引擎。本地MV使用ReplicatedAggregatingMergeTree作为引擎。此外,我在测试数据库上重新创建了每个带有填充的MVs。这样,就不可能重复插入。

问题是,当我从分布式MV中选择countMerge时,我得到了正确答案的两倍(即,如果正确答案为50,则得到100),而从分布式MV中选择uniqExactMerge将给出正确的结果。

这是我的剧本:

本地MV脚本:

代码语言:javascript
复制
CREATE DATABASE IF NOT EXISTS test ON CLUSTER cc_cluster;
CREATE MATERIALIZED VIEW IF NOT EXISTS
test.user_event_stat_scene_mv_local_test_v2 ON CLUSTER cc_cluster
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{layer}-{shard}/test.user_event_stat_scene_mv_local_test_v2', '{replica}')
PARTITION BY (dt)
ORDER BY (dt, scene)
POPULATE
AS select
    countState(1) as expos,
    countState(if(click>0, 1, null)) as clicks,
    sumState(if(click=1, watch, 0)) as dr,
    countState(if(click=1 and tbl.cost_cnt>0, 1, null)) as cost_cnt,
    sumState(if(click=1, tbl.cost_total, 0)) as cost_total,
    countState(if(click=1 and tbl.chat_cnt>0, 1, null)) as chat_cnt,
    uniqExactState(recom_token) as item_expos,
    uniqExactState(if(click=1, recom_token, null)) as item_clicks,
    uniqExactState(uid) as user_expos,
    uniqExactState(if(click=1, uid, null)) as user_clicks,
    uniqExactState(if(click=1 and tbl.cost_cnt>0, uid, null)) as user_costs,
    uniqExactState(if(click=1 and tbl.chat_cnt>0, uid, null)) as user_chats,
    scene,
    toFixedString(dt, 8) as dt
FROM recom_stats_dws.user_event_log_day_local as tbl
GROUP BY dt, scene;

分布式MV脚本如下:

代码语言:javascript
复制
CREATE TABLE IF NOT EXISTS
test.user_event_stat_scene_mv_all_test_v2
ON CLUSTER cc_cluster
AS test.user_event_stat_scene_mv_local_test_v2
ENGINE = Distributed(cc_cluster, test, user_event_stat_scene_mv_local_test_v2, rand());

查询脚本是:

代码语言:javascript
复制
select
    countMerge(expos) as expos,
    countMerge(clicks) as clicks,
    sumMerge(dr) as dr,
    countMerge(cost_cnt) as cost_cnt,
    sumMerge(cost_total) as cost_total,
    countMerge(chat_cnt) as chat_cnt, 
    uniqExactMerge(item_expos) as item_expos, 
    uniqExactMerge(item_clicks) as item_clicks, 
    uniqExactMerge(user_expos) as user_expos,
    uniqExactMerge(user_clicks) as user_clicks,
    uniqExactMerge(user_costs) as user_costs,
    uniqExactMerge(user_chats) as user_chats,
    scene,
    dt
FROM test.user_event_stat_scene_mv_all_test_v2 as tbl
GROUP BY dt, scene
order by dt, scene;

PS:

包含完整数据的test.user_event_stat_scene_mv_local_test_v2

  • distributed MV = test.user_event_stat_scene_mv_all_test_v2

  • local表= recom_stats_dws.user_event_log_day_local

哪一部分我可能做错了什么?希望能得到你的帮助XD

EN

回答 1

Stack Overflow用户

发布于 2020-10-21 14:14:12

https://github.com/ClickHouse/ClickHouse/issues/16208

你已经执行了两次填充。在每个复制品上。它的数据翻了一番。

您只需要在一个副本上运行填充。

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/64461891

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档