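While trying to improve query performance, I found that slot utilization was low: for long stretches the query used only about 100 slots, although the limit is 2,000.
By studying the logs, I found that "parallelInputs" for the bottleneck stage was only 80. The input, however, is about 100 million rows with no duplicate values. I therefore think the query planner should increase the number of output sinks of the stage that precedes the bottleneck stage.
Is there a way to encourage the query planner to increase the number of output sinks?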
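For reference, the per-stage figures mentioned above can also be read with SQL rather than from the logs. The following is a minimal sketch, not a definitive recipe: it assumes the job ran in the US multi-region (`region-us`) within the last day, and `your_job_id` is a hypothetical placeholder for the real job ID. It relies on the `job_stages` array exposed by `INFORMATION_SCHEMA.JOBS_BY_PROJECT`, where `parallel_inputs` corresponds to "parallelInputs" in the query plan.

```sql
-- Sketch: list per-stage parallelism for one finished query job.
-- Assumptions: `region-us` and 'your_job_id' are placeholders to replace.
SELECT
  stage.id AS stage_id,
  stage.name AS stage_name,
  stage.parallel_inputs,            -- "parallelInputs" in the query plan
  stage.completed_parallel_inputs,
  stage.records_read,
  stage.records_written,
  stage.slot_ms
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT,
  UNNEST(job_stages) AS stage
WHERE job_id = 'your_job_id'
  -- Restrict the time range so the jobs view is not scanned in full.
  AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY stage.id;
```

A stage whose `parallel_inputs` is small compared with its `records_read` is a likely candidate for the kind of bottleneck described here.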
==============================
I have solved my performance problem with a "union" approach.
In my case I used subqueries rather than a view, as shown below:
with slow_stage as (
...
), multiple_read as (
select
*
from slow_stage where MOD(key, 3)=0
union all
select
*
from slow_stage where MOD(key, 3)=1
...
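)
I expected a threefold improvement. However, the improvement was more than 5%, owing to an increase in the number of output sinks of slow_stage. Previously slow_stage had only 80 output sinks. After applying the "union" approach it has more than 1,000.
I believe the number of output sinks depends on the size and shape of the output (I referred to "Google BigQuery: The Definitive Guide"). In this case I did not change the output of slow_stage, so I do not know what caused the main improvement.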
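Posted on 2020-05-10 08:14:23
There is a hack that you can use with caution:
The view below splits the table yourDataset.yourTable into 16 shards by the first character of col1. It assumes col1 is a string that starts with a lowercase hexadecimal digit; rows that match none of the branches are not returned. This does not increase cost, because BigQuery charges by the size of the table scanned.
Again, use it with caution, because the hack may stop working in the future.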
-- Each branch reads one shard, selected by the first character of col1 (SUBSTR positions are 1-based).
CREATE VIEW yourDataset.yourJoinedTable AS (
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '0' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '1' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '2' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '3' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '4' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '5' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '6' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '7' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '8' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = '9' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = 'a' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = 'b' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = 'c' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = 'd' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = 'e' UNION ALL
SELECT * FROM yourDataset.yourTable WHERE SUBSTR(col1, 1, 1) = 'f'
);
-- Query the view in place of the original table.
SELECT ...
FROM yourDataset.yourJoinedTable;
https://stackoverflow.com/questions/61692099