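I have a JSON file with a nested parent-child relationship, as shown below: typical organizational data of managers and the employees who report to them.
An employee can manage one or more employees, who in turn can manage other employees. The nesting can go N levels deep.
Input data: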
{
    "employee_id": "e1",
    "employee_name": "employee name 1",
    "join_date": "2011-01-01",
    " manages ": [
        {
            " employee_id ": " e11 ",
            " employee_name ": " employee name 11 ",
            " join_date ": " 2011 - 02 - 01 "
        },
        {
            " employee_id ": " e12 ",
            " employee_name ": " employee name 12 ",
            " join_date ": " 2011 - 02 - 02 ",
            " manages ": [
                {
                    " employee_id ": " e121 ",
                    " employee_name ": " employee name 121 ",
                    " join_date ": " 2011 - 02 - 21 "
                }
            ]
        }
    ]
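}
I want to load this into a structured DataFrame in which each employee's row carries the ID of their manager.
Here, Manager_id is the "employee id" of the parent node.
Expected output:

Any help or suggestions would be much appreciated. Thanks in advance.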
Posted on 2022-11-06 07:11:08
The JSON file below extends yours, for demonstration purposes:
{
    "employee_id": "e1",
    "employee_name": "employee name 1",
    "join_date": "2011-01-01",
    " manages ": [
        {
            " employee_id ": " e11 ",
            " employee_name ": " employee name 11 ",
            " join_date ": " 2011 - 02 - 01 "
        },
        {
            " employee_id ": " e12 ",
            " employee_name ": " employee name 12 ",
            " join_date ": " 2011 - 02 - 02 ",
            " manages ": [
                {
                    " employee_id ": " e121 ",
                    " employee_name ": " employee name 121 ",
                    " join_date ": " 2011 - 02 - 21 ",
                    " manages ": [
                        {
                            " employee_id ": " e1211 ",
                            " employee_name ": " employee name 1211 ",
                            " join_date ": " 2018 - 08 - 15 "
                        }
                    ]
                }
            ]
        },
        {
            " employee_id ": " e13 ",
            " employee_name ": " employee name 13 ",
            " join_date ": " 2011 - 09 - 09 ",
            " manages ": [
                {
                    " employee_id ": " e131 ",
                    " employee_name ": " employee name 131 ",
                    " join_date ": " 2014 - 04 - 7 "
                }
            ]
        }
    ]
}
You can define the following recursive function:
def func(data, new_data):
    """Recursive helper function.

    Args:
        data: input json data (one employee node).
        new_data: target dict of column lists, filled in place.

    Returns:
        reshaped data as dict.
    """
    # Keys in the file carry stray spaces (" employee_id "); normalise them
    data = {key.replace(" ", ""): value for key, value in data.items()}
    new_data["employee_id"].append(data["employee_id"])
    new_data["employee_name"].append(data["employee_name"])
    new_data["join_date"].append(data["join_date"])
    if data.get("manages", None):
        for item in data["manages"]:
            # Record the parent's id for each child before descending into it
            new_data["manager_id"].append(data["employee_id"])
            func(item, new_data)
    return new_data

Then:
import json

import pandas as pd

with open("file.json") as f:
    data = json.load(f)

# Apply recursive func to json; manager_id starts with None for the root
df = pd.DataFrame(
    func(
        data,
        {"employee_id": [], "employee_name": [], "join_date": [], "manager_id": [None]},
    )
)

# Cleanup: remove the stray spaces around hyphens and at both ends of each value
df = df.apply(
    lambda x: x.str.replace(" - ", "-").str.rstrip(" ").str.lstrip(" ")
).fillna("")

Which gives:
print(df)
# Output
   employee_id       employee_name   join_date  manager_id
0           e1     employee name 1  2011-01-01
1          e11    employee name 11  2011-02-01          e1
2          e12    employee name 12  2011-02-02          e1
3         e121   employee name 121  2011-02-21         e12
4        e1211  employee name 1211  2018-08-15        e121
5          e13    employee name 13  2011-09-09          e1
6         e131   employee name 131   2014-04-7         e13

https://stackoverflow.com/questions/74297962
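
For comparison, here is a minimal sketch of the same parent-to-child flattening written as a generator that yields one row per employee, so the manager's ID travels down as an argument instead of being appended to a shared list. This is not the code from the answer above: the names `iter_rows`, `SAMPLE` and `FIELDS` are made up for illustration, the sample is a shortened copy of the question's data, and the cleanup assumes that stray spaces only appear at the ends of keys and values and around the hyphens in dates.

```python
import json

# Shortened inline copy of the question's data (assumption: same shape,
# with the same stray spaces inside keys and values).
SAMPLE = """
{
  "employee_id": "e1",
  "employee_name": "employee name 1",
  "join_date": "2011-01-01",
  " manages ": [
    {" employee_id ": " e11 ",
     " employee_name ": " employee name 11 ",
     " join_date ": " 2011 - 02 - 01 "},
    {" employee_id ": " e12 ",
     " employee_name ": " employee name 12 ",
     " join_date ": " 2011 - 02 - 02 ",
     " manages ": [
       {" employee_id ": " e121 ",
        " employee_name ": " employee name 121 ",
        " join_date ": " 2011 - 02 - 21 "}
     ]}
  ]
}
"""

FIELDS = ("employee_id", "employee_name", "join_date")


def iter_rows(node, manager_id=None):
    """Yield one flat row per employee, each parent before its reports."""
    # Normalise keys: " employee_id " -> "employee_id".
    node = {key.strip(): value for key, value in node.items()}
    row = {field: node[field].strip() for field in FIELDS}
    # Dates arrive as "2011 - 02 - 01"; collapse the padded hyphens.
    row["join_date"] = row["join_date"].replace(" - ", "-")
    row["manager_id"] = manager_id
    yield row
    # Recurse into direct reports, passing this employee's id as their manager.
    for report in node.get("manages", []):
        yield from iter_rows(report, manager_id=row["employee_id"])


rows = list(iter_rows(json.loads(SAMPLE)))
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` should give a frame with the same four columns; the root employee's `manager_id` is then `None` rather than an empty string, unless `fillna("")` is applied as in the answer.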