我想在BigQuery标准SQL中创建一个用户定义函数(CREATE TEMPORARY FUNCTION),它将接受从一堆行中聚合的值。
我的模式和表类似于:
| c1 | c2 | c3 | c4 |
|=======|=======|=======|=======|
| 1 | 1-1 | 3A | 4A |
| 1 | 1-1 | 3B | 4B |
| 1 | 1-1 | 3C | 4C |
| 1 | 1-2 | 3D | 4D |
| 2 | 2-1 | 3E | 4E |
| 2 | 2-1 | 3F | 4F |
| 2 | 2-2 | 3G | 4G |
| 2 | 2-2 | 3H | 4H |我不能将原始模式更改为由嵌套或数组字段组成。
我希望按c1和c2分组,并将c3和c4的值传递给函数,同时能够在每一行的c3和c4值之间进行匹配。一种方法是使用ARRAY_AGG并将值作为Array传递,但ARRAY_AGG是不确定的,因此来自c3和c4的值可能具有与源表不同的顺序。Example
CREATE TEMPORARY FUNCTION
tempConcatStrFunction(c3 ARRAY<STRING>, c4 ARRAY<STRING>)
RETURNS STRING
LANGUAGE js AS """
return
c3
.map((item, index) => [ item, c4[index] ].join(','))
.join(',');
""";
WITH T as (
SELECT c1, c2, ARRAY_AGG(c3) as c3, ARRAY_AGG(c4) as c4
GROUP BY c1, c2
)
SELECT c1, c2, tempConcatStrFunction(c3, c4) as str from T结果应该是:
| c1 | c2 | str |
|=======|=======|======================|
| 1 | 1-1 | 3A,4A,3B,4B,3C,4C |
| 1 | 1-2 | 3D,4D |
| 2 | 2-1 | 3E,4E,3F,4F |
| 2 | 2-2 | 3G,4G,3H,4H |有什么想法可以达到这样的效果吗?
发布于 2016-11-11 00:59:29
有什么想法可以达到这样的效果吗?
我知道你的问题是关于如何使c3和c4在最后的字符串中相互匹配。不如让它像下面这样保持超级简单
SELECT c1, c2, STRING_AGG(CONCAT(c3, ',', c4)) AS str
FROM yourTable
GROUP BY c1, c2发布于 2016-11-11 03:16:46
下面是几个可能有助于设置查询的示例:
WITH T AS (
SELECT 1 AS c1, '1-1' AS c2, '3A' AS c3, '4A' AS c4 UNION ALL
SELECT 1, '1-1', '3B', '4B' UNION ALL
SELECT 1, '1-1', '3C', '4C' UNION ALL
SELECT 1, '1-2', '3D', '4D' UNION ALL
SELECT 2, '2-1', '3E', '4E' UNION ALL
SELECT 2, '2-1', '3F', '4F' UNION ALL
SELECT 2, '2-2', '3G', '4G' UNION ALL
SELECT 2, '2-2', '3H', '4H'
)
SELECT
c1,
c2,
STRING_AGG(CONCAT(c3, ',', c4)) AS str
FROM T
GROUP BY 1, 2;它接受未聚合的输入(就像Mikhail的答案一样),并进行字符串连接。
如果输入已经聚合到数组中,理想情况下它们应该一起重复,例如:
WITH T AS (
SELECT 1 AS c1, '1-1' AS c2, '3A' AS c3, '4A' AS c4 UNION ALL
SELECT 1, '1-1', '3B', '4B' UNION ALL
SELECT 1, '1-1', '3C', '4C' UNION ALL
SELECT 1, '1-2', '3D', '4D' UNION ALL
SELECT 2, '2-1', '3E', '4E' UNION ALL
SELECT 2, '2-1', '3F', '4F' UNION ALL
SELECT 2, '2-2', '3G', '4G' UNION ALL
SELECT 2, '2-2', '3H', '4H'
),
U AS (
SELECT
c1,
c2,
ARRAY_AGG(STRUCT(c3, c4)) AS arr
FROM T
)
SELECT
c1,
c2,
(SELECT STRING_AGG(CONCAT(c3, ',', c4)) FROM UNNEST(arr)) AS str
FROM U
GROUP BY 1, 2;如果数组是分开的,但顺序(和长度)是一致的,那么您可以在事后重新组合它们:
WITH T AS (
SELECT 1 AS c1, '1-1' AS c2, '3A' AS c3, '4A' AS c4 UNION ALL
SELECT 1, '1-1', '3B', '4B' UNION ALL
SELECT 1, '1-1', '3C', '4C' UNION ALL
SELECT 1, '1-2', '3D', '4D' UNION ALL
SELECT 2, '2-1', '3E', '4E' UNION ALL
SELECT 2, '2-1', '3F', '4F' UNION ALL
SELECT 2, '2-2', '3G', '4G' UNION ALL
SELECT 2, '2-2', '3H', '4H'
),
U AS (
SELECT
c1,
c2,
ARRAY_AGG(c3 ORDER BY c3, c4) AS arr3,
ARRAY_AGG(c4 ORDER BY c3, c4) AS arr4
FROM T
GROUP BY 1, 2
)
SELECT
c1,
c2,
(SELECT STRING_AGG(CONCAT(arr4[OFFSET(off)], ',', c3))
FROM UNNEST(arr3) AS c3 WITH OFFSET off) AS str
FROM U;https://stackoverflow.com/questions/40526329
复制相似问题