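The AI Platform output for a tabular dataset looks like this:

or
{
  "classes": ["a","b","c"],
  "scores": [0.9,0.1,0.0]
}
A single record field contains two arrays: predicted_label.classes holds the labels, and predicted_label.scores holds the scores produced by AI Platform.
I want to select the class with the highest score. For the example above, I would like output such as row=0, class="a", score=0.9.
As I understand it, UNNEST does not immediately solve my problem, because it requires its input to be an array. I believe this would be much easier if the output were a repeated record.
What SQL query would let me extract the correct label from the AI Platform batch prediction results?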
Posted on 2021-02-09 11:30:07
Try this:
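-- Sample row mirroring the AI Platform output: one struct holding two parallel arrays.
with testdata as (
  select struct(["a", "b", "c"] as classes, [0.9, 0.1, 0.0] as scores) as predicted_label
)
select (
  -- Pair each class with the score at the same array position, then keep the top pair.
  select struct(offset, class, score)
  from unnest(predicted_label.classes) as class with offset
  join unnest(predicted_label.scores) as score with offset
  using (offset)
  order by score desc
  limit 1
) as highest
from testdata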
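
To get closer to the row=0, class="a", score=0.9 shape asked for in the question, the same correlated subquery can be run once per row of a larger table. The following is only a sketch in BigQuery Standard SQL: the predictions CTE, the row_id column, and the second sample row are assumptions made up for illustration, not part of the actual AI Platform output.

```sql
-- Hypothetical stand-in for the batch prediction table; row_id is an assumed key.
WITH predictions AS (
  SELECT 0 AS row_id,
         STRUCT(['a', 'b', 'c'] AS classes, [0.9, 0.1, 0.0] AS scores) AS predicted_label
  UNION ALL
  SELECT 1 AS row_id,
         STRUCT(['a', 'b', 'c'] AS classes, [0.2, 0.7, 0.1] AS scores) AS predicted_label
)
SELECT row_id, highest.class, highest.score
FROM (
  SELECT
    row_id,
    (
      -- Align the two arrays by position and keep the best-scoring pair for this row.
      SELECT AS STRUCT class, score
      FROM UNNEST(predicted_label.classes) AS class WITH OFFSET
      JOIN UNNEST(predicted_label.scores) AS score WITH OFFSET
      USING (offset)
      ORDER BY score DESC
      LIMIT 1
    ) AS highest
  FROM predictions
)
ORDER BY row_id
```

With the two sample rows above, this should return (0, 'a', 0.9) and (1, 'b', 0.7). Because the subquery is evaluated separately for each row, the LIMIT 1 applies per row rather than to the whole table.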

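Posted on 2021-02-09 17:57:10
You should design your prediction list so that each label and its score are represented as a key-value pair.
The BigQuery table schema for that array looks like this:
prediction RECORD REPEATED
prediction.label STRING REQUIRED
prediction.score FLOAT REQUIRED
Why?
SQL example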
with this_model as (
  -- One array of (label, score) structs instead of two parallel arrays.
  select [
    struct('a' as label, 0.9 as score),
    struct('b' as label, 0.1 as score),
    struct('c' as label, 0.0 as score)
  ] as prediction
)
select pair.label, pair.score
from this_model, unnest(prediction) as pair
-- Sorts every flattened pair and keeps the single highest-scoring one.
order by pair.score desc
limit 1;

https://stackoverflow.com/questions/66117514
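
One caveat with the repeated-record query: ORDER BY ... LIMIT 1 over the flattened table keeps a single pair across all rows, so it only matches the question's goal when the table holds one prediction. A per-row variant might look like the sketch below, where the row_id column and the second sample row are assumptions added for illustration.

```sql
-- Hypothetical two-row table in the repeated-record layout; row_id is an assumed key.
WITH this_model AS (
  SELECT 0 AS row_id, [
    STRUCT('a' AS label, 0.9 AS score),
    STRUCT('b' AS label, 0.1 AS score),
    STRUCT('c' AS label, 0.0 AS score)
  ] AS prediction
  UNION ALL
  SELECT 1 AS row_id, [
    STRUCT('a' AS label, 0.2 AS score),
    STRUCT('b' AS label, 0.7 AS score),
    STRUCT('c' AS label, 0.1 AS score)
  ] AS prediction
)
SELECT
  row_id,
  (
    -- Evaluated once per row: the best (label, score) pair in that row's array.
    SELECT AS STRUCT pair.label, pair.score
    FROM UNNEST(prediction) AS pair
    ORDER BY pair.score DESC
    LIMIT 1
  ) AS top
FROM this_model
ORDER BY row_id
```

For the sample data this should yield top = ('a', 0.9) for row 0 and top = ('b', 0.7) for row 1. No positional join is needed here, since each label already travels with its score.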