首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何将一个单元格中有多个值的csv转换为python中的嵌套json

如何将一个单元格中有多个值的csv转换为python中的嵌套json
EN

Stack Overflow用户
提问于 2020-01-30 12:18:52
回答 1查看 275关注 0票数 0

我有一个csv文件,在一个单元格中有多个值,格式如下:

代码语言:javascript
复制
ID, Name,      Role,               Task,                       Responsibility
123, Stephen, "1. Give, 2. Take", "1.1. DO, 1.2. AB, 2.1. DF", "1.1.1. FG, 1.1.2. GH, 1.2.1. SG, 2.1.1. DF, 2.1.2. JK"

为了提高可读性,我添加了一些空格。我需要将这个csv文件转换为嵌套的json格式,如下所示:

代码语言:javascript
复制
{
    "Name" : "Stephen",
    "123": {
        "1": {
            "Role": "Give",
            "1.1": {
                "Task": "DO",
                "1.1.1": {
                    "Responsibility": "FG"
                },
                "1.1.2": {
                    "Responsibility": "GH"
                }
            },
            "1.2": {
                "Task": "AB",
                "1.2.1": {
                    "Responsibility": "SG"
                }
            }
        },
        "2": {
            "Role": "Take",
            "2.1": {
                "Task": "DF",
                "2.1.1": {
                    "Responsibility": "DF"
                },
                "2.1.2": {
                    "Responsibility": "JK"
                }
            }
        }
    }
}

数字是这样的1,1.1,1.2.1,2.2,2.3,2.3.1。我需要一个来检测这样的单元格(或这样的列类型),并将其转换为上面这样的key:value对。

EN

回答 1

Stack Overflow用户

发布于 2020-01-30 12:42:39

您可以对itertools.groupby使用递归

代码语言:javascript
复制
from itertools import groupby as gb
def to_dict(data):
   d = [(a, list(b)) for a,b in gb(sorted(data, key=lambda x:x[0][0]), key=lambda x:x[0][0])]
   return {b[0][1]:{**b[0][-1], **to_dict([[j, k, l] for [_, *j], k, l in b if j])} for a,b in d}

代码语言:javascript
复制
import re, json
s = """
ID, Name,      Role,               Task,                       Responsibility
123, Stephen, "1. Give, 2. Take", "1.1. DO, 1.2. AB, 2.1. DF", "1.1.1. FG, 1.1.2. GH, 1.2.1. SG, 2.1.1. DF, 2.1.2. JK"
"""
#below: parse desired values from data and format header
[h1, h2, *h], [_id, n, *_data] = [re.findall('(?<=")[^"]+|\w+', i) for i in filter(None, s.split('\n'))]
#transform numerical paths as lists
data = [[b.split('. ') for b in i.split(', ')] for i in _data if i != ', ']
#associate original file headers to the transformed data
formed = [l for a, b in zip(h, data) for l in [[c.split('.'), c, {a:d}] for c, d in b]]
print(json.dumps({h2:n, h1:to_dict(formed)}, indent=4)) 

输出:

代码语言:javascript
复制
{
  "Name": "Stephen",
  "ID": {
    "1": {
        "Role": "Give",
        "1.1": {
            "Task": "DO",
            "1.1.1": {
                "Responsibility": "FG"
            },
            "1.1.2": {
                "Responsibility": "GH"
            }
        },
        "1.2": {
            "Task": "AB",
            "1.2.1": {
                "Responsibility": "SG"
            }
        }
    },
    "2": {
        "Role": "Take",
        "2.1": {
            "Task": "DF",
            "2.1.1": {
                "Responsibility": "DF"
            },
            "2.1.2": {
                "Responsibility": "JK"
            }
        }
      }
   }
}
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/59979008

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档