I have thousands of users, and I'm trying to find the most common segment for each of them. The data looks like this:
User Segment
User 1 Good
User 1 Good
User 1 Poor
User 2 Medium
User 2 Medium
User 3 Poor
In this case, the SQL code would return:
User 1 Good
User 2 Medium
User 3 Poor
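I tried using the MAX function, but it only returns the alphabetically greatest string, not the string that occurs most often for each user. As an FYI, a user's segment can change frequently. It may well change 8-10 times, but I'm looking for the one that best represents each user.
Thanks!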
Answered 2020-03-17 23:12:33
You can use the row_number() and count() analytic functions to do this:
select user, segment
from (
    select user, segment, cnt,
           -- partition by user only, so that a user's segments are ranked against each other
           row_number() over (partition by user order by cnt desc) as rn
    from (
        select user, segment,
               count(segment) over (partition by user, segment) as cnt
        from your_table
    ) t1
) t2
where rn = 1;

Answered 2020-03-17 23:17:54
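You can use aggregation, like this:
-- Note: coalesce() returns its first non-null argument, so this picks a segment
-- by fixed priority (Good, then Medium, then Poor), not by how often it occurs.
select user,
       coalesce(max(case when segment = 'Good' then segment end),
                max(case when segment = 'Medium' then segment end),
                max(case when segment = 'Poor' then segment end)
               ) as segment
from t
group by user;

Answered 2020-03-17 23:48:43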
Because window functions are applied after GROUP BY, you can also do it this way. You can replace row_number() with rank() or dense_rank(), depending on how you want to handle ties.
select user, segment
from (
    select user, segment,
           row_number() over (partition by user order by count(*) desc) as rn
    from your_table
    group by user, segment
) t
where rn = 1;

https://stackoverflow.com/questions/60725042
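To see how the choice between row_number() and rank() plays out on ties, here is a minimal sketch that runs the GROUP BY query above against the sample data using Python's built-in sqlite3 module. The table name user_segments, the helper names build_db and most_common, and the extra "User 4" rows (a 2-2 tie) are made up for illustration. It also assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Rows from the question, plus a hypothetical "User 4" whose segments tie 2-2.
ROWS = [
    ("User 1", "Good"), ("User 1", "Good"), ("User 1", "Poor"),
    ("User 2", "Medium"), ("User 2", "Medium"),
    ("User 3", "Poor"),
    ("User 4", "Good"), ("User 4", "Good"),
    ("User 4", "Poor"), ("User 4", "Poor"),
]


def build_db():
    """Create an in-memory table (hypothetical name: user_segments) and load ROWS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table user_segments (user text, segment text)")
    conn.executemany("insert into user_segments values (?, ?)", ROWS)
    return conn


def most_common(conn, window_fn, tie_break=""):
    """Return the (user, segment) rows ranked first by count within each user.

    window_fn is "row_number", "rank" or "dense_rank"; tie_break is an
    optional extra ORDER BY term such as ", segment".
    """
    sql = f"""
        select user, segment
        from (
            select user, segment,
                   {window_fn}() over (
                       partition by user
                       order by count(*) desc{tie_break}
                   ) as rn
            from user_segments
            group by user, segment
        ) t
        where rn = 1
        order by user, segment
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_db()
    # row_number() keeps exactly one row per user; the extra sort key makes
    # the winner of a tie deterministic (alphabetical by segment).
    print(most_common(conn, "row_number", tie_break=", segment"))
    # rank() gives every tied segment rank 1, so User 4 appears twice.
    print(most_common(conn, "rank"))
```

With row_number() and no tie-break term, which of the tied segments is returned is up to the database, so adding a second sort key is worth considering if the result needs to be repeatable.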