
Left join lateral with a conditional sum

Stack Overflow user
Asked 2019-02-24 13:18:00
Answers: 2, Views: 317, Followers: 0, Votes: 2

I have a purchase dataset with customers, products, and categories.

customer     product     category    sales_value
       A     aerosol     air_care             10
       B     aerosol     air_care             12
       C     aerosol     air_care              7
       A     perfume     air_care              8
       A     perfume     air_care              2
       D     perfume     air_care             11
       C      burger         food             13
       D       fries         food              6
       C       fries         food              9

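For each product, I want the ratio between the sales value spent on that product and the sales value spent on the product's category, counting only customers who bought the product at least once.

Put another way: take the customers who bought fries at least once, and compute for them A) the sum of sales value spent on fries and B) the sum of sales value spent on food.

The intermediate table would look like this: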

product    category  sum_spent_on_product           sum_spent_on_category    ratio
                                                 by_people_buying_product
aerosol    air_care                    29                              39     0.74
perfume    air_care                    21                              31     0.68
 burger        food                    13                              22     0.59
  fries        food                    15                              28     0.54
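
The definition can be checked by hand against the sample data. The following is a minimal sketch in plain Python, not part of the original question; the function name `product_ratios` and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

# Sample rows from the question: (customer, product, category, sales_value)
rows = [
    ("A", "aerosol", "air_care", 10),
    ("B", "aerosol", "air_care", 12),
    ("C", "aerosol", "air_care", 7),
    ("A", "perfume", "air_care", 8),
    ("A", "perfume", "air_care", 2),
    ("D", "perfume", "air_care", 11),
    ("C", "burger", "food", 13),
    ("D", "fries", "food", 6),
    ("C", "fries", "food", 9),
]


def product_ratios(rows):
    """Return {product: (spent_on_product, spent_on_category_by_buyers, ratio)}."""
    buyers = defaultdict(set)              # product -> customers who bought it
    category_of = {}                       # product -> its category
    spent_on_product = defaultdict(int)    # product -> total spend on it
    spent_by_cust_cat = defaultdict(int)   # (customer, category) -> total spend
    for customer, product, category, value in rows:
        buyers[product].add(customer)
        category_of[product] = category
        spent_on_product[product] += value
        spent_by_cust_cat[(customer, category)] += value

    result = {}
    for product, customers in buyers.items():
        category = category_of[product]
        # Each buyer's category total is counted exactly once per product.
        cat_total = sum(spent_by_cust_cat[(c, category)] for c in customers)
        prod_total = spent_on_product[product]
        result[product] = (prod_total, cat_total, prod_total / cat_total)
    return result


ratios = product_ratios(rows)
for product, (prod_total, cat_total, ratio) in sorted(ratios.items()):
    print(product, prod_total, cat_total, round(ratio, 4))
```

For fries, the buyers are C and D: they spent 9 + 6 = 15 on fries and 22 + 6 = 28 on food, so the ratio is 15 / 28, roughly 0.536.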

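As an illustration (with made-up figures, not the sample data above): people who bought aerosol at least once spent a total of £1800 on this product. Overall, the same people spent £3600 on the air_care category (to which aerosol belongs). The ratio for aerosol is therefore 0.5.

I tried to solve this with left join lateral, computing the intermediate result for each product, but I can't figure out how to include the condition "only for customers who bought this specific product":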

select
    distinct (product)
  , category
  , c.sales_category
from transactions t
left join lateral (
  select
    sum(sales_value) as sales_category
  from transactions
  where category = t.category
  group by category
) c on true
;

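The query above lists, for each product, the total spent on the product's category, but without the required condition on who bought the product.

Is left join lateral the right approach? Is there another solution in plain SQL?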

2 Answers

Stack Overflow user

Accepted answer

Posted 2019-02-24 20:31:17

I would use a window function to compute each customer's total spend in each category:

SELECT
  customer, product, category, sales_value,
  sum(sales_value) OVER (PARTITION BY customer, category) AS tot_cat
FROM transactions;

 customer | product | category | sales_value | tot_cat 
----------+---------+----------+-------------+---------
 A        | aerosol | air_care |       10.00 |   20.00
 A        | perfume | air_care |        8.00 |   20.00
 A        | perfume | air_care |        2.00 |   20.00
 B        | aerosol | air_care |       12.00 |   12.00
 C        | aerosol | air_care |        7.00 |    7.00
 C        | fries   | food     |        9.00 |   22.00
 C        | burger  | food     |       13.00 |   22.00
 D        | perfume | air_care |       11.00 |   11.00
 D        | fries   | food     |        6.00 |    6.00

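Then we just need to sum up. A problem arises when a customer buys the same product several times. In your example, customer A bought perfume twice. To get around this, let's also group by customer, product, and category (and sum the sales_value column):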

SELECT
  customer, product, category, SUM(sales_value) AS sales_value,
  SUM(SUM(sales_value)) OVER (PARTITION BY customer, category) AS tot_cat
FROM transactions
GROUP BY customer, product, category

 customer | product | category | sales_value | tot_cat 
----------+---------+----------+-------------+---------
 A        | aerosol | air_care |       10.00 |   20.00
 A        | perfume | air_care |       10.00 |   20.00 <-- this row summarizes rows 2 and 3 of previous result
 B        | aerosol | air_care |       12.00 |   12.00
 C        | aerosol | air_care |        7.00 |    7.00
 C        | burger  | food     |       13.00 |   22.00
 C        | fries   | food     |        9.00 |   22.00
 D        | perfume | air_care |       11.00 |   11.00
 D        | fries   | food     |        6.00 |    6.00
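
To see the problem concretely, here is a minimal sketch that sums tot_cat per product straight from the first, ungrouped window query. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (for window functions) rather than the PostgreSQL this answer targets, and it stores sales_value as integers for readability; neither choice comes from the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(customer TEXT, product TEXT, category TEXT, sales_value INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("A", "aerosol", "air_care", 10),
        ("B", "aerosol", "air_care", 12),
        ("C", "aerosol", "air_care", 7),
        ("A", "perfume", "air_care", 8),
        ("A", "perfume", "air_care", 2),
        ("D", "perfume", "air_care", 11),
        ("C", "burger", "food", 13),
        ("D", "fries", "food", 6),
        ("C", "fries", "food", 9),
    ],
)

# Naive version: window over the raw rows, then sum per product.
# Customer A has two perfume rows, so A's air_care total is added twice.
naive = conn.execute(
    """
    SELECT product, SUM(sales_value) AS sales_value, SUM(tot_cat) AS tot_cat
    FROM (
        SELECT product, sales_value,
               SUM(sales_value) OVER (PARTITION BY customer, category) AS tot_cat
        FROM transactions
    ) AS w
    GROUP BY product
    ORDER BY product
    """
).fetchall()

for row in naive:
    print(row)
```

With the sample data, perfume ends up with a category total of 51 (20 + 20 + 11) instead of the intended 31 (20 + 11), which is why the rows are first collapsed to one per customer and product.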

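Now we just need to sum sales_value and tot_cat per product and category to get your intermediate result table. I use a common table expression to refer to the previous result under the name t: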

WITH t AS (
  SELECT
    customer, product, category, SUM(sales_value) AS sales_value,
    SUM(SUM(sales_value)) OVER (PARTITION BY customer, category) AS tot_cat
  FROM transactions
  GROUP BY customer, product, category
)
SELECT
  product, category,
  sum(sales_value) AS sales_value, sum(tot_cat) AS tot_cat,
  sum(sales_value) / sum(tot_cat) AS ratio
FROM t
GROUP BY product, category;

 product | category | sales_value | tot_cat |         ratio          
---------+----------+-------------+---------+------------------------
 aerosol | air_care |       29.00 |   39.00 | 0.74358974358974358974
 fries   | food     |       15.00 |   28.00 | 0.53571428571428571429
 burger  | food     |       13.00 |   22.00 | 0.59090909090909090909
 perfume | air_care |       21.00 |   31.00 | 0.67741935483870967742
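
For readers who find SUM(SUM(sales_value)) OVER (...) hard to parse, the same computation can be spelled as two explicit steps: aggregate first, then apply the window to the aggregated rows. The sketch below is a hedged rewrite, not the answerer's exact query. It runs on SQLite through Python's sqlite3 module (an assumption, needing SQLite 3.25 or later), stores sales_value as integers, and the CTE name `per_product` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(customer TEXT, product TEXT, category TEXT, sales_value INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        ("A", "aerosol", "air_care", 10),
        ("B", "aerosol", "air_care", 12),
        ("C", "aerosol", "air_care", 7),
        ("A", "perfume", "air_care", 8),
        ("A", "perfume", "air_care", 2),
        ("D", "perfume", "air_care", 11),
        ("C", "burger", "food", 13),
        ("D", "fries", "food", 6),
        ("C", "fries", "food", 9),
    ],
)

two_step = conn.execute(
    """
    WITH per_product AS (
        -- Step 1: one row per customer and product.
        SELECT customer, product, category, SUM(sales_value) AS sales_value
        FROM transactions
        GROUP BY customer, product, category
    ),
    t AS (
        -- Step 2: attach each customer's category total to those rows.
        SELECT customer, product, category, sales_value,
               SUM(sales_value) OVER (PARTITION BY customer, category) AS tot_cat
        FROM per_product
    )
    SELECT product, category,
           SUM(sales_value) AS sales_value,
           SUM(tot_cat) AS tot_cat,
           1.0 * SUM(sales_value) / SUM(tot_cat) AS ratio  -- 1.0 * avoids integer division
    FROM t
    GROUP BY product, category
    ORDER BY product
    """
).fetchall()

for row in two_step:
    print(row)
```

The totals match the result table above: 29/39 for aerosol, 13/22 for burger, 15/28 for fries, and 21/31 for perfume.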
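
Votes: 3

Stack Overflow user

Posted 2019-02-24 13:29:16

"For each product, I want the ratio between the sales value spent on that product and the sales value spent on the product's category, counting only customers who bought the product at least once."

If I understand this correctly, you can aggregate sales by customer and category to get each customer's total for the category. In Postgres, you can keep an array of products and use it for matching. The query then looks like this: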

select p.product, p.category,
       sum(p.sales_value) as product_only_sales, 
       sum(pp.sales_value) as comparable_sales
from purchases p join
     (select customer, category, array_agg(distinct product) as products, sum(sales_value) as sales_value
      from purchases p
      group by customer, category
     ) pp
     on p.customer = pp.customer and p.category = pp.category and p.product = any (pp.products)
group by p.product, p.category;

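Here is a db<>fiddle.

EDIT:

The data allows duplicates of a product for the same customer. That throws things off, because each duplicate row is joined to the customer's category total again. The solution is to pre-aggregate by product for each customer: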

select p.product, p.category, sum(p.sales_value) as product_only_sales, sum(pp.sales_value) as comparable_sales
from (select customer, category, product, sum(sales_value) as sales_value
      from purchases p
      group by customer, category, product
     ) p join
     (select customer, category, array_agg(distinct product) as products, sum(sales_value) as sales_value
      from purchases p
      group by customer, category
     ) pp
     on p.customer = pp.customer and p.category = pp.category and p.product = any (pp.products)
group by p.product, p.category

Here is a db<>fiddle for this example.
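
As far as I can tell, the condition `p.product = any (pp.products)` is always true in this join: the array for a (customer, category) pair holds every product that customer bought in the category, and p.product is one of them. Under that assumption the array can be dropped, which also makes the query portable to databases without array_agg. The following is a minimal sketch using Python's sqlite3 module; loading the sample data into a purchases table with integer values and adding a ratio column (to match the question) are assumptions, not part of the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases "
    "(customer TEXT, product TEXT, category TEXT, sales_value INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        ("A", "aerosol", "air_care", 10),
        ("B", "aerosol", "air_care", 12),
        ("C", "aerosol", "air_care", 7),
        ("A", "perfume", "air_care", 8),
        ("A", "perfume", "air_care", 2),
        ("D", "perfume", "air_care", 11),
        ("C", "burger", "food", 13),
        ("D", "fries", "food", 6),
        ("C", "fries", "food", 9),
    ],
)

portable = conn.execute(
    """
    SELECT p.product, p.category,
           SUM(p.sales_value)  AS product_only_sales,
           SUM(pp.sales_value) AS comparable_sales,
           1.0 * SUM(p.sales_value) / SUM(pp.sales_value) AS ratio
    FROM (SELECT customer, category, product, SUM(sales_value) AS sales_value
          FROM purchases
          GROUP BY customer, category, product
         ) p
    JOIN (SELECT customer, category, SUM(sales_value) AS sales_value
          FROM purchases
          GROUP BY customer, category
         ) pp
      ON p.customer = pp.customer AND p.category = pp.category
    GROUP BY p.product, p.category
    ORDER BY p.product
    """
).fetchall()

for row in portable:
    print(row)
```

Because p has exactly one row per customer and product, each buyer's category total from pp is added once per product, giving 39, 22, 28, and 31 for aerosol, burger, fries, and perfume.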

Votes: 2

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/54852276
