我想使用CTE在Snowflake中产生分层输出。下面是我的两个表Dimension_territory和territory_member_list:
维度testdb.dbo.dimension_territory包含区域键和区域名称
create table testdb.dbo.dimension_territory (
territory_key integer,
territory_name varchar ) ;其中,(1, 'World Wide')是根目录
insert into testdb.dbo.dimension_territory values
(1, 'WorldWide'),
(2, 'Western Hemisphere'),
(3, 'North America'),
(4, 'Canada') ;territory_member_list表格包含父子关系。
create table testdb.dbo.territory_member_list (
parent_territory_key integer,
child_territory_key integer );
insert into testdb.dbo.territory_member_list values
(1, 2), -- WorldWide , Western Hemisphere
(2, 3), -- Western Hemisphere , North America , North america under Western hemisphere
(3, 4) -- North America ,Canada , Canada under North America在testdb.dbo.territory_member_list (1, 2), -- WorldWide -> Western Hemisphere的第一个条目中,值1 'WordlWide‘是2’西半球‘(子级)的父级,同样
派生祖先和后代的CTE的预期输出应如下表所示。任何帮助都将不胜感激。包含所需字段的输出表:

发布于 2021-05-08 17:01:17
使用"breadcrumb“数组作为辅助结构来确定路径中的位置(朴素的方法):
WITH RECURSIVE cte AS (
SELECT dt.*, tml.parent_territory_key
FROM dimension_territory dt
LEFT JOIN territory_member_list tml
ON tml.child_territory_key = dt.territory_key
), rec AS (
SELECT cte.TERRITORY_KEY, cte.TERRITORY_NAME, cte.parent_territory_key,
ARRAY_CONSTRUCT(cte.TERRITORY_KEY) AS a, 0 AS lvl
FROM cte
WHERE PARENT_TERRITORY_KEY IS NULL
UNION ALL
SELECT cte.TERRITORY_KEY, cte.TERRITORY_NAME, cte.parent_territory_key,
ARRAY_APPEND(rec.a, cte.TERRITORY_KEY), lvl+1
FROM rec
JOIN cte ON rec.TERRITORY_KEY = cte.parent_territory_key
), lognest_path AS (
SELECT * FROM rec QUALIFY lvl = MAX(lvl) OVER()
), cartesian AS (
SELECT dt1.TERRITORY_KEY AS ANCESTOR_KEY, dt1.TERRITORY_NAME AS ANCESTOR_NAME,
dt2.TERRITORY_KEY AS DESCENDANT_KEY, dt2.TERRITORY_NAME AS DESCENDANT_NAME
FROM dimension_territory dt1
CROSS JOIN dimension_territory dt2
)
SELECT DISTINCT c.*, lp.a,
ARRAY_POSITION(c.ancestor_key,lp.a) AS a_p,
ARRAY_POSITION(c.descendant_key, lp.a) AS d_p,
a_p = 0 AND d_p = 0 AS IS_ROOT,
d_p - a_p AS EDGE_DISTANCE
FROM cartesian c
JOIN lognest_path lp
ON ARRAY_POSITION(c.ancestor_key,lp.a) >= 0
AND ARRAY_POSITION(c.descendant_key, lp.a) >=0
WHERE a_p <= d_p
ORDER BY ANCESTOR_KEY, DESCENDANT_KEY;输出:

发布于 2021-05-08 16:30:03
所以因为我使用了两个CTE来处理我的假表格/数据,为了使用Recursive CTE,如果它不是WITH之后的第一项,我就把它插入到一个子CTE中。一旦建立了关系,我们就可以再次双重连接维度以获得名称。
WITH dimension_territory(territory_key, territory_name) AS (
SELECT * FROM VALUES
(1, 'WorldWide'),
(2, 'Western Hemisphere'),
(3, 'North America'),
(4, 'Canada')
), territory_member_list(parent_territory_key, child_territory_key) AS (
SELECT * FROM VALUES
(1, 2),
(2, 3),
(3, 4)
), h_cte AS (
WITH RECURSIVE hierarchy(p_key, c_key, is_root, edge_distance) AS (
-- Anchor Clause
SELECT territory_key
,territory_key
,territory_key = 1
,0
FROM dimension_territory
--WHERE parent_territory_key = 1
UNION ALL
-- Recursive Clause
SELECT h.p_key
,ml.child_territory_key
,false
,edge_distance + 1
FROM territory_member_list AS ml
JOIN hierarchy AS h
ON ml.parent_territory_key = h.c_key --OR ml.child_territory_key =
)
SELECT * FROM hierarchy
)
SELECT d_p.territory_key as ancestor_territory_key
,d_p.territory_name as ancestor_territory_name
,d_c.territory_key as descendant_territory_key
,d_c.territory_name as descendant_territory_name
,h.is_root
,h.edge_distance
FROM h_cte as h
JOIN dimension_territory AS d_p
ON h.p_key = d_p.territory_key
JOIN dimension_territory AS d_c
ON h.c_key = d_c.territory_key
ORDER BY 1,2;提供:
ANCESTOR_TERRITORY_KEY ANCESTOR_TERRITORY_NAME DESCENDANT_TERRITORY_KEY DESCENDANT_TERRITORY_NAME IS_ROOT EDGE_DISTANCE
1 WorldWide 1 WorldWide TRUE 0
1 WorldWide 2 Western Hemisphere FALSE 1
1 WorldWide 3 North America FALSE 2
1 WorldWide 4 Canada FALSE 3
2 Western Hemisphere 2 Western Hemisphere FALSE 0
2 Western Hemisphere 3 North America FALSE 1
2 Western Hemisphere 4 Canada FALSE 2
3 North America 3 North America FALSE 0
3 North America 4 Canada FALSE 1
4 Canada 4 Canada FALSE 0因为您想要的输出是想要每个节点的子树,所以我在Anchor子句中选择了dimension_territory中的所有节点,这允许通过假设1是根来设置is_root,并将每个距离设置为0。从那里,递归子句将递归数据与边缘列表连接起来,以构建desendants集。
因此,为了去掉"data“CTE,它看起来像这样:
WITH RECURSIVE hierarchy(p_key, c_key, is_root, edge_distance) AS (
-- Anchor Clause
SELECT territory_key
,territory_key
,territory_key = 1
,0
FROM dimension_territory
UNION ALL
-- Recursive Clause
SELECT h.p_key
,ml.child_territory_key
,false
,edge_distance + 1
FROM territory_member_list AS ml
JOIN hierarchy AS h
ON ml.parent_territory_key = h.c_key --OR ml.child_territory_key =
)
SELECT d_p.territory_key as ancestor_territory_key
,d_p.territory_name as ancestor_territory_name
,d_c.territory_key as descendant_territory_key
,d_c.territory_name as descendant_territory_name
,h.is_root
,h.edge_distance
FROM hierarchy as h
JOIN dimension_territory AS d_p
ON h.p_key = d_p.territory_key
JOIN dimension_territory AS d_c
ON h.c_key = d_c.territory_key
ORDER BY 1,2;https://stackoverflow.com/questions/67443918
复制相似问题