我想不出一个有效的查询来比较两个具有不同属性的表。这是一个在线零售商的报告,他有几十万个可供销售的SKU。每个SKU都是“父”产品的变体。他们在不同的市场上销售,并需要查看是否有在不同地方无法出售的商品。
其中一个表包含所有父产品,另一个表包含其对应SKU的所有变体。在第三个表中,他们有每个sku (变体)的完整列表,以及它对应的市场,其中sku + marketplace的组合是唯一的。
数据库使用PostgreSQL
表结构如下:
产品表:
Products
id | parent_sku | vendor_id
-------------------------------
1 | ABC | 100
2 | DEF | 200
3 | XYZ | 100变量表:
Variations
id | parent_id | sku
----------------------------
1 | 1 | ABC-1
2 | 1 | ABC-2
3 | 1 | ABC-3
4 | 2 | DEF-1
5 | 2 | DEF-2
6 | 3 | XYZ-1
7 | 3 | XYZ-2Marketplace表:
MarketplaceData
id | sku | marketplace | price
----------------------------
1 | ABC-1 | website1 | 99.99
2 | ABC-2 | website1 | 99.99
3 | ABC-3 | website1 | 89.99
4 | DEF-1 | website1 | 29.99
5 | DEF-2 | website1 | 29.99
6 | XYZ-1 | website1 | 39.99
7 | XYZ-2 | website1 | 39.99
8 | ABC-1 | website2 | 99.99
9 | ABC-2 | website2 | 99.99
10 | ABC-3 | website2 | 99.99
11 | DEF-1 | website2 | 29.99
12 | DEF-2 | website2 | 29.99
13 | XYZ-1 | website2 | 34.99
14 | XYZ-2 | website2 | 34.99我有一个可以工作的查询,但它执行起来非常长,而且非常繁重。
SELECT DISTINCT parent_id FROM Variations
WHERE sku IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2'))
AND sku NOT IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4'))
LIMIT 20 OFFSET 0 由于每个sku + marketplace数据集有近400,000行,而MarketplaceData表包含超过200万行,因此此查询永远需要执行。
在索引方面,id列是每个索引的主键。Variations表在sku上有索引(必须是唯一的),MarketplaceData在sku + marketplace上索引。
最终,我需要的是满足条件的唯一的parent_id的列表。
任何帮助或指导都将不胜感激。
谢谢!
发布于 2019-03-24 17:12:44
您可以使用INNER JOIN和LEFT JOIN检查null,而不是IN和NOT In
SELECT DISTINCT v.parent_id
FROM Variations v
INNER JOIN (
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2')
) t1 on t1.sku = v.sku
LEFT JOIN (
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4')
) t2 On t2.sku = v.sku
WHERE t2.sku is null发布于 2019-03-24 17:52:44
为什么只使用一个子查询?
SELECT DISTINCT parent_id
FROM Variations
WHERE sku IN (SELECT sku FROM MarketplaceData WHERE marketplace IN ('website1','website2')
except
SELECT sku FROM MarketplaceData WHERE marketplace IN ('website3','website4'))
LIMIT 20 OFFSET 0 发布于 2019-03-24 20:36:38
使用简单的聚合来获取skus如何?
select mpd.sku
from MarketplaceData mpd
where mpd.marketplace in ('website1', 'website2', 'website3', 'website4')
group by mpd.sku
having count(*) filter (where mpd.marketplace in ('website1', 'website2')) > 0 and
count(*) filter (where mpd.marketplace in ('website3', 'website4')) = 0;然后获取父ids:
select distinct v.parent_id
from variations v join
(select mpd.sku
from MarketplaceData mpd
where mpd.marketplace in ('website1', 'website2', 'website3', 'website4')
group by mpd.sku
having count(*) filter (where mpd.marketplace in ('website1', 'website2')) > 0 and
count(*) filter (where mpd.marketplace in ('website3', 'website4')) = 0
) m
on m.sku = v.sku;https://stackoverflow.com/questions/55322146
复制相似问题