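My order import from Shopify into BigQuery creates a new entry for an order whenever something has changed since the last import, so you can see how an order's attributes changed over time instead of only the last imported state. This also produces multiple entries for the same order in the table, where the only unique parts are the _sdc_batched_at and _sdc_sequence values. I sometimes see as many as 30 entries for the same order.
Table schema...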
order:
order_number: Int
fulfillments: Array
_sdc_batched_at: DateTime
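_sdc_sequence: Int
What I have done...
I have created a partitioned table that essentially boils down to the subset of entries within a given date range that have more than 0 fulfillments.
Initial query to reduce the dataset: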
with orders as (
select order_number, fulfillments, _sdc_batched_at, _sdc_sequence
from `project.shopify.orders`
where created_at between '2018-11-08' and '2018-11-15'
and ARRAY_LENGTH(fulfillments) > 0
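)
The problem...
I ran into trouble trying to use DISTINCT or GROUP BY, because fulfillments is an array, which throws things off. How do I write a query that returns only the latest entry for each order, by _sdc_batched_at value?
Sample data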
[
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 02:46:21.270 UTC",
"_sdc_sequence": "1541817507934"
},
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 03:16:16.606 UTC",
"_sdc_sequence": "1541819139795"
},
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 03:46:12.704 UTC",
"_sdc_sequence": "1541821046476"
},
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:16:07.952 UTC",
"_sdc_sequence": "1541822755508"
},
{
"order_number": "2212",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 03:46:12.704 UTC",
"_sdc_sequence": "1541821046476"
},
{
"order_number": "2212",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:16:07.952 UTC",
"_sdc_sequence": "1541822755508"
}
]
Expected result
Return only the latest entry for each order, by _sdc_batched_at value
{
"order_number": "5545",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:16:07.952 UTC",
"_sdc_sequence": "1541822755508"
},
{
"order_number": "2212",
"fulfillments": [
{
"tracking_url": null,
"id": "617029074993",
"tracking_company": "ups",
"tracking_number": "Z1234567890"
}
],
"_sdc_batched_at": "2018-11-10 04:16:07.952 UTC",
"_sdc_sequence": "1541822755508"
}
Answered on 2018-11-15 22:42:27
The following is for BigQuery Standard SQL:
SELECT AS VALUE ARRAY_AGG(t ORDER BY _sdc_batched_at DESC LIMIT 1)[OFFSET(0)]
FROM `project.shopify.orders` t
GROUP BY order_number
Obviously, you can add whatever filters you need in a WHERE clause.
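As a rough sketch of how the filters from the question might be combined with this query (assuming created_at is a column on the table, as the question's initial query suggests):

```sql
-- Sketch only: latest entry per order_number by _sdc_batched_at,
-- restricted to the date range and fulfillment condition from the question.
-- Assumption: created_at exists on `project.shopify.orders`.
SELECT AS VALUE
  -- Collect each order's rows, newest first, and keep only the first one.
  ARRAY_AGG(t ORDER BY _sdc_batched_at DESC LIMIT 1)[OFFSET(0)]
FROM `project.shopify.orders` t
WHERE created_at BETWEEN '2018-11-08' AND '2018-11-15'
  AND ARRAY_LENGTH(fulfillments) > 0
GROUP BY order_number
```

Because the WHERE clause runs before the aggregation, this returns the latest entry among those that already have at least one fulfillment. Aggregating whole rows with ARRAY_AGG also avoids grouping by the fulfillments array itself, which is what caused trouble with DISTINCT and GROUP BY.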
https://stackoverflow.com/questions/53328631