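I have data on products sold by various stores. In some stores they are sold at a discount, which is flagged by PROMO_FLG. I want to show two COUNT ... OVER (PARTITION BY ...) columns.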
+-------+------+-----------+
| Store | Item | PROMO_FLG |
|-------+------+-----------|
|     1 |    1 |         0 |
|     2 |    1 |         1 |
|     3 |    1 |         0 |
|     4 |    1 |         0 |
|     5 |    1 |         1 |
|     6 |    1 |         1 |
|     7 |    1 |         1 |
|     8 |    1 |         0 |
|     9 |    1 |         0 |
|    10 |    1 |         0 |
+-------+------+-----------+

The first column shows all stores that carry this product (done):
COUNT(DISTINCT STORE) OVER (PARTITION BY ITEM) gives 10.
The second, the one I am looking for, counts only the stores that have PROMO_FLG = 1.
That should give 4.
Posted on 2020-04-08 23:15:42
I think you want:
select t.*,
count(*) over (partition by item) as num_stores,
sum(promo_flg) over (partition by item) as num_promo_1
from t;

If you really need distinct counts:
select t.*,
count(distinct store) over (partition by item) as num_stores,
count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from t;

Here is a db<>fiddle. The fiddle uses Oracle, because it supports COUNT(DISTINCT) as a window function.
If window functions don't work, here is an alternative:
select *
from t join
(select item, count(distinct store) as num_stores, count(distinct case when promo_flg = 1 then store end) as num_stores_promo
from t
group by item
) tt
using (item);

Posted on 2020-04-09 07:56:38
Using Gordon's answer, but showing that it works in Snowflake:
select v.*
,count(distinct store) over (partition by item) as num_stores
,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
,sum(iff(promo_flg = 1, 1, 0)) over (partition by item) as num_sum_promo_stores
from values
(1 , 1, 0 ),
(2 , 1, 1 ),
(3 , 1, 0 ),
(4 , 1, 0 ),
(5 , 1, 1 ),
(6 , 1, 1 ),
(7 , 1, 1 ),
(8 , 1, 0 ),
(9 , 1, 0 ),
(10, 1, 0 )
v(store, item, promo_flg);

gives:
STORE ITEM PROMO_FLG NUM_STORES NUM_DIS_PROMO_STORES NUM_SUM_PROMO_STORES
1 1 0 10 4 4
2 1 1 10 4 4
3 1 0 10 4 4
4 1 0 10 4 4
5 1 1 10 4 4
6 1 1 10 4 4
7 1 1 10 4 4
8 1 0 10 4 4
9 1 0 10 4 4
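10 1 0 10 4 4

So, depending on whether you want the distinct count or the sum, both are shown. I used IFF, a non-standard SQL form that Snowflake supports, because I like the shorter SQL. But you can see that both work.
I also tested Gordon's second case, count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1, and it works as written.
To answer Marcin2x4's question on Gordon's answer: if/when the data deviates from what you described, the methods will give different results. That happens when a store has multiple rows for the same item with promo_flg set, or when promo_flg has a non-zero value other than 1 (such as the 2 below):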
select v.*
,count(distinct store) over (partition by item) as num_stores
,count(distinct iff(promo_flg = 1, store, null)) over (partition by item) as num_dis_promo_stores
,sum(iff(promo_flg <> 0, 1, 0)) over (partition by item) as num_sum_promo_stores
,sum(promo_flg) over (partition by item) as num_promo_1
,count(distinct case when promo_flg = 1 then store end) over (partition by item) as num_promo_1
from values
(1 , 1, 0 ),
(2 , 1, 1 ),
(3 , 1, 0 ),
(4 , 1, 0 ),
(5 , 1, 1 ),
(6 , 1, 1 ),
(7 , 1, 1 ),
(8 , 1, 0 ),
(9 , 1, 0 ),
(10, 1, 0 ),
(7, 1, 1 ),
(7, 1, 2 )
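v(store, item, promo_flg);

Then num_dis_promo_stores and the count(distinct case ...) version of num_promo_1 give 4, num_sum_promo_stores gives 6, and the sum(promo_flg) version of num_promo_1 gives 7. Note that the alias num_promo_1 is used for two different columns in this query.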
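The 4 / 6 / 7 split can be reproduced outside Snowflake. The following is only a sketch under a few assumptions: it uses Python's built-in sqlite3 module as a stand-in database, it uses plain GROUP BY aggregates instead of window functions (for a single item the per-item aggregate is the same number the windowed version repeats on every row), and it swaps IFF for the standard CASE expression. The helper name promo_counts and the table name t are illustrative, not part of either answer.

```python
import sqlite3


def promo_counts(rows):
    """Return (num_stores, distinct promo stores, non-zero flag rows, sum of flags).

    Hypothetical helper: loads the rows into an in-memory SQLite table and
    runs the four aggregates from the answers, grouped by item.
    Assumes all rows belong to a single item, so one result row comes back.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (store integer, item integer, promo_flg integer)")
    conn.executemany("insert into t values (?, ?, ?)", rows)
    result = conn.execute(
        """
        select count(distinct store),
               count(distinct case when promo_flg = 1 then store end),
               sum(case when promo_flg <> 0 then 1 else 0 end),
               sum(promo_flg)
        from t
        group by item
        """
    ).fetchone()
    conn.close()
    return result


# The ten rows from the question: one row per store, flag is 0 or 1.
clean = [
    (1, 1, 0), (2, 1, 1), (3, 1, 0), (4, 1, 0), (5, 1, 1),
    (6, 1, 1), (7, 1, 1), (8, 1, 0), (9, 1, 0), (10, 1, 0),
]
# The same rows plus the two extra rows for store 7 from the second query.
messy = clean + [(7, 1, 1), (7, 1, 2)]

print(promo_counts(clean))  # (10, 4, 4, 4)
print(promo_counts(messy))  # (10, 4, 6, 7)
```

On the clean data all three promo measures agree. Once a store repeats or the flag takes a value other than 0 or 1, only the distinct count still answers "how many stores have the promo".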
https://stackoverflow.com/questions/61104072