我有个数据
index json_col
1 json_1
2 json_2
...其中每个json_1、jason_2等都是json文件。例如,json_1是
[
{
"origin": "a",
"destination": "b",
"leg": "a->b",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-10-08 18:10:00",
"arrivalTimeZulu": "2022-10-08 22:30:00"
}
]
},
{
"origin": "b",
"destination": "c",
"leg": "b->c",
"flights": [
{
"aircraftType": "73H",
"departureTimeZulu": "2022-10-08 14:51:00",
"arrivalTimeZulu": "2022-10-08 18:07:00"
}
]
},
{
"origin": "c",
"destination": "d",
"leg": "c-d",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-10-08 01:30:00",
"arrivalTimeZulu": "2022-10-08 05:24:00"
}
]
}
]我想对我的json_col中的每个json文件应用这个逻辑。
所以:
期望产出:
[
{
"origin": "a",
"destination": "b",
"leg": "a->b",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-10-08 18:10:00",
"arrivalTimeZulu": "2022-10-08 22:30:00"
}
]
},
{
"origin": "b",
"destination": "c",
"leg": "b->c",
"flights": [
{
"aircraftType": "73H",
"departureTimeZulu": "2022-10-09 14:51:00",
"arrivalTimeZulu": "2022-10-09 18:07:00"
}
]
},
{
"origin": "c",
"destination": "d",
"leg": "c-d",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-10-10 01:30:00",
"arrivalTimeZulu": "2022-10-10 05:24:00"
}
]
}
]发布于 2022-10-16 07:48:57
使用以下玩具dataframe (您的json对象加上用于演示的另一个对象):
import pandas as pd
df = pd.DataFrame(
{
"index": [1, 2],
"json_col": [
[
{
"origin": "a",
"destination": "b",
"leg": "a->b",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-10-08 18:10:00",
"arrivalTimeZulu": "2022-10-08 22:30:00",
}
],
},
{
"origin": "b",
"destination": "c",
"leg": "b->c",
"flights": [
{
"aircraftType": "73H",
"departureTimeZulu": "2022-10-08 14:51:00",
"arrivalTimeZulu": "2022-10-08 18:07:00",
}
],
},
{
"origin": "c",
"destination": "d",
"leg": "c-d",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-10-08 01:30:00",
"arrivalTimeZulu": "2022-10-08 05:24:00",
}
],
},
],
[
{
"origin": "a",
"destination": "b",
"leg": "a->b",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-11-09 18:10:00",
"arrivalTimeZulu": "2022-11-09 21:30:00",
}
],
},
{
"origin": "b",
"destination": "c",
"leg": "b->c",
"flights": [
{
"aircraftType": "73H",
"departureTimeZulu": "2022-11-09 14:51:00",
"arrivalTimeZulu": "2022-11-09 19:07:00",
}
],
},
{
"origin": "c",
"destination": "d",
"leg": "c-d",
"flights": [
{
"aircraftType": "763",
"departureTimeZulu": "2022-11-09 01:30:00",
"arrivalTimeZulu": "2022-11-09 03:24:00",
}
],
},
],
],
}
)下面是使用Pandas 合并、积木和DateOffset实现这一任务的一种方法
for idx in df.index:
# Make a dataframe properly formatted from each json file
temp = pd.concat(
[
pd.DataFrame(df.loc[idx, "json_col"][i]["flights"])
for i in range(len(df.loc[idx, "json_col"]))
],
ignore_index=True,
).astype(
{"departureTimeZulu": "datetime64[ns]", "arrivalTimeZulu": "datetime64[ns]"}
)
# Evaluate number of days to add to each arrival and departure dates
temp["delta"] = (
(temp["departureTimeZulu"] < temp["arrivalTimeZulu"].shift(1))
.astype(int)
.cumsum()
)
# Increment arrival and departure dates
for col in ["departureTimeZulu", "arrivalTimeZulu"]:
temp[col] = temp.apply(
lambda x: x[col] + pd.DateOffset(days=x["delta"]), axis=1
).astype(str)
# Cleanup and get data back in json file
temp = temp.drop(columns="delta")
new_values = temp.to_dict(orient="records")
for i, new_value in enumerate(new_values):
df.loc[idx, "json_col"][i]["flights"] = new_value然后,您可以检查每个json对象是否已被正确修改:
print(df.loc[0, "json_col"])
# Output
[
{
"origin": "a",
"destination": "b",
"leg": "a->b",
"flights": {
"aircraftType": "763",
"departureTimeZulu": "2022-10-08 18:10:00",
"arrivalTimeZulu": "2022-10-08 22:30:00",
},
},
{
"origin": "b",
"destination": "c",
"leg": "b->c",
"flights": {
"aircraftType": "73H",
"departureTimeZulu": "2022-10-09 14:51:00",
"arrivalTimeZulu": "2022-10-09 18:07:00",
},
},
{
"origin": "c",
"destination": "d",
"leg": "c-d",
"flights": {
"aircraftType": "763",
"departureTimeZulu": "2022-10-10 01:30:00",
"arrivalTimeZulu": "2022-10-10 05:24:00",
},
},
]
print(df.loc[1, "json_col"])
# Output
[
{
"origin": "a",
"destination": "b",
"leg": "a->b",
"flights": {
"aircraftType": "763",
"departureTimeZulu": "2022-11-09 18:10:00",
"arrivalTimeZulu": "2022-11-09 21:30:00",
},
},
{
"origin": "b",
"destination": "c",
"leg": "b->c",
"flights": {
"aircraftType": "73H",
"departureTimeZulu": "2022-11-10 14:51:00",
"arrivalTimeZulu": "2022-11-10 19:07:00",
},
},
{
"origin": "c",
"destination": "d",
"leg": "c-d",
"flights": {
"aircraftType": "763",
"departureTimeZulu": "2022-11-11 01:30:00",
"arrivalTimeZulu": "2022-11-11 03:24:00",
},
},
]https://stackoverflow.com/questions/74018441
复制相似问题