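I am trying to convert CSV data into JSON with a parent-child relationship for my Python application. The CSV file contains gene symbols and disease names. I want to group the rows by gene symbol and turn each gene's diseases into its children. Sample CSV file: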
gene,disease
A1BG,Adenocarcinoma
A1BG,apnea
A1BG,Athritis
A2M,Asthma
A2M,Astrocytoma
A2M,Diabetes
NAT1,polyps
NAT1,lymphoma
NAT1,neoplasms

The output JSON should have the following format. Please let me know what changes are needed to get the desired output.
{
    "name": "A1BG",
    "children": [
        {"name": "Adenocarcinoma", "size": 2138},
        {"name": "apnea", "size": 3824},
        {"name": "Athritis", "size": 1353}
    ]
},
{
    "name": "A2M",
    "children": [
        {"name": "Asthma", "size": 2138},
        {"name": "Astrocytoma", "size": 3824},
        {"name": "Diabetes", "size": 1353}
    ]
},
{
    "name": "NAT1",
    "children": [
        {"name": "polyps", "size": 2138},
        {"name": "lymphoma", "size": 3824},
        {"name": "neoplasms", "size": 1353}
    ]
}

The Python code I wrote is:
from itertools import groupby
from collections import OrderedDict
import json
import pandas as pd

df = pd.read_csv('test.csv')
finalList = []
finalDict = {}
grouped = df.groupby(['gene'])
for key, value in grouped:
    dictionary = {}
    j = grouped.get_group(key).reset_index(drop=True)
    dictionary['gene'] = j.at[0, 'gene']
    dictList = []
    anotherDict = {}
    for i in j.index:
        anotherDict['disese'] = j.at[i, 'disease']
        dictList.append(anotherDict)
    dictionary['children'] = dictList
    finalList.append(dictionary)
json.dumps(finalList)

Posted on 2020-09-06 17:58:25
Maybe try something like this:
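import json
import pandas as pd

df = pd.read_csv("test.csv")  # same file as in the question

# Collect each gene's diseases into a list, then rename the columns
# to the "name"/"children" keys used in the desired output.
result = (
    df
    .groupby("gene", as_index=False).agg(list)
    .rename(columns={"gene": "name", "disease": "children"})
    .to_dict("records")
)
with open('output.json', "w") as out: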
    json.dump(result, out, indent=4)

Output:

[
    {
        "name": "A1BG",
        "children": [
            "Adenocarcinoma",
            "apnea",
            "Athritis"
        ]
    },
    {
        "name": "A2M",
        "children": [
            "Asthma",
            "Astrocytoma",
            "Diabetes"
        ]
    },
    {
        "name": "NAT1",
        "children": [
            "polyps",
            "lymphoma",
            "neoplasms"
        ]
    }
]

https://stackoverflow.com/questions/63762565
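
The output above lists each child as a plain string, whereas the desired output in the question uses one object per child. If that shape matters, a minimal sketch using only the standard library might look like the following. The function name `rows_to_tree` and the inline `SAMPLE_CSV` are hypothetical, chosen only for illustration, and the `size` field is left out because the sample CSV has no column to take it from.

```python
import csv
import io
import json

# Hypothetical inline sample; in practice this would be open("test.csv", newline="").
SAMPLE_CSV = """gene,disease
A1BG,Adenocarcinoma
A1BG,apnea
A2M,Asthma
NAT1,polyps
"""


def rows_to_tree(csv_file):
    """Group diseases under their gene, keeping first-seen gene order."""
    groups = {}  # gene -> list of child dicts (dicts preserve insertion order)
    for row in csv.DictReader(csv_file):
        # Build a fresh dict per row so the children are distinct objects.
        groups.setdefault(row["gene"], []).append({"name": row["disease"]})
    return [
        {"name": gene, "children": children}
        for gene, children in groups.items()
    ]


tree = rows_to_tree(io.StringIO(SAMPLE_CSV))
print(json.dumps(tree, indent=4))
```

Creating a new dictionary for every row is the important detail here: appending the same dictionary object repeatedly would make every child show the last disease read.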