首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何使用Python正确解析嵌套的json,其中几个键相等

如何使用Python正确解析嵌套的json,其中几个键相等
EN

Stack Overflow用户
提问于 2020-02-02 22:07:13
回答 1查看 83关注 0票数 1

我对flatten / json_normalize函数有问题。有一个嵌套的json,里面有6个“收据”,但是把这个json压平,只给我一排一张收据,这也是最后一张,我需要我的熊猫数据中的所有6条。

代码语言:javascript
复制
[
  {
    "_index": "packets-2020-02-03",
    "_type": "receipts_file",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "frame.encap_type": "25",
          "frame.time": "Feb  3, 2019 00:17:14.004011000 MSK",
          "frame.offset_shift": "0.000000000",
          "frame.time_epoch": "2575325034.004011000",
          "frame.time_delta": "0.002843000",
          "frame.time_delta_displayed": "0.002843000",
          "frame.time_relative": "0.002852000",
          "frame.number": "4",
          "frame.len": "1294",
          "frame.cap_len": "1294",
          "frame.marked": "0",
          "frame.ignored": "0",
          "frame.protocols": "several"
        },
        "receipts": {
          "receipts.command_length": "238",
          "receipts.command_id": "0x00000005",
          "receipts.sequence_number": "47207",
          "receipts.data_coding": "0x00000000",
          "receipts.data_coding_tree": {
            "receipts.rps": "0x00000000",
            "Receipt Type 1 Data Coding": {
              "receipts.rps.rc_coding_group": "0x00000000",
              "receipts.rps.text_compression": "0",
              "receipts.rps.class_present": "0",
              "receipts.rps.charset": "0x00000000"
            },
            "Receipt Type 2 Data Coding": {
              "receipts.rps.rpk._coding_group": "0x00000000",
              "receipts.rps.rpk._language": "0x00000000"
            }
          },
          "receipts.rc_default_receipt_id": "0",
          "receipts.rc_length": "117",
          "receipts.receipt": "29831",
          "receipts.opt_params": {
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003002",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "47912"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003001",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "98982"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003004",
              "receipts.opt_param_len": "1",
              "receipts.vendor_op": "00"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003000",
              "receipts.opt_param_len": "4",
              "receipts.vendor_op": "23080"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003003",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "29849"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x0000001e",
              "receipts.opt_param_len": "9",
              "receipts.receipted_receipt_id": "949BB6DE"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00000427",
              "receipts.opt_param_len": "1",
              "receipts.receipt_state": "2"
            }
          }
        },
        "receipts": {
          "receipts.command_length": "241",
          "receipts.command_id": "0x00000005",
          "receipts.sequence_number": "47208",
          "receipts.data_coding": "0x00000000",
          "receipts.data_coding_tree": {
            "receipts.rps": "0x00000000",
            "Receipt Type 1 Data Coding": {
              "receipts.rps.rc_coding_group": "0x00000000",
              "receipts.rps.text_compression": "0",
              "receipts.rps.class_present": "0",
              "receipts.rps.charset": "0x00000000"
            },
            "Receipt Type 2 Data Coding": {
              "receipts.rps.rpk._coding_group": "0x00000000",
              "receipts.rps.rpk._language": "0x00000000"
            }
          },
          "receipts.rc_default_receipt_id": "0",
          "receipts.rc_length": "117",
          "receipts.receipt": "98341",
          "receipts.opt_params": {
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003002",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "38220"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003001",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "93813"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003004",
              "receipts.opt_param_len": "1",
              "receipts.vendor_op": "00"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003000",
              "receipts.opt_param_len": "4",
              "receipts.vendor_op": "98381"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003003",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "77371"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x0000001e",
              "receipts.opt_param_len": "9",
              "receipts.receipted_receipt_id": "6DED391C"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00000427",
              "receipts.opt_param_len": "1",
              "receipts.receipt_state": "2"
            }
          }
        },
        "receipts": {
          "receipts.command_length": "238",
          "receipts.command_id": "0x00000005",
          "receipts.sequence_number": "47209",
          "receipts.data_coding": "0x00000000",
          "receipts.data_coding_tree": {
            "receipts.rps": "0x00000000",
            "Receipt Type 1 Data Coding": {
              "receipts.rps.rc_coding_group": "0x00000000",
              "receipts.rps.text_compression": "0",
              "receipts.rps.class_present": "0",
              "receipts.rps.charset": "0x00000000"
            },
            "Receipt Type 2 Data Coding": {
              "receipts.rps.rpk._coding_group": "0x00000000",
              "receipts.rps.rpk._language": "0x00000000"
            }
          },
          "receipts.rc_default_receipt_id": "0",
          "receipts.rc_length": "117",
          "receipts.opt_params": {
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003002",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "38717"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003001",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "37788"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003004",
              "receipts.opt_param_len": "1",
              "receipts.vendor_op": "74818"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003000",
              "receipts.opt_param_len": "4",
              "receipts.vendor_op": "77812"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003003",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "39999"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x0000001e",
              "receipts.opt_param_len": "9",
              "receipts.receipted_receipt_id": "273A872F"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00000427",
              "receipts.opt_param_len": "1",
              "receipts.receipt_state": "2"
            }
          }
        },
        "receipts": {
          "receipts.command_length": "242",
          "receipts.command_id": "0x00000005",
          "receipts.sequence_number": "47210",
          "receipts.data_coding": "0x00000000",
          "receipts.data_coding_tree": {
            "receipts.rps": "0x00000000",
            "Receipt Type 1 Data Coding": {
              "receipts.rps.rc_coding_group": "0x00000000",
              "receipts.rps.text_compression": "0",
              "receipts.rps.class_present": "0",
              "receipts.rps.charset": "0x00000000"
            },
            "Receipt Type 2 Data Coding": {
              "receipts.rps.rpk._coding_group": "0x00000000",
              "receipts.rps.rpk._language": "0x00000000"
            }
          },
          "receipts.rc_default_receipt_id": "0",
          "receipts.rc_length": "118",
          "receipts.receipt": "69322",
          "receipts.opt_params": {
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003002",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "83881"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003001",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "73188"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003004",
              "receipts.opt_param_len": "1",
              "receipts.vendor_op": "00"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003000",
              "receipts.opt_param_len": "4",
              "receipts.vendor_op": "78881"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003003",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "74388"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x0000001e",
              "receipts.opt_param_len": "9",
              "receipts.receipted_receipt_id": "949C60DF"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00000427",
              "receipts.opt_param_len": "1",
              "receipts.receipt_state": "2"
            }
          }
        },
        "receipts": {
          "receipts.command_length": "238",
          "receipts.command_id": "0x00000005",
          "receipts.sequence_number": "47211",
          "receipts.data_coding": "0x00000000",
          "receipts.data_coding_tree": {
            "receipts.rps": "0x00000000",
            "Receipt Type 1 Data Coding": {
              "receipts.rps.rc_coding_group": "0x00000000",
              "receipts.rps.text_compression": "0",
              "receipts.rps.class_present": "0",
              "receipts.rps.charset": "0x00000000"
            },
            "Receipt Type 2 Data Coding": {
              "receipts.rps.rpk._coding_group": "0x00000000",
              "receipts.rps.rpk._language": "0x00000000"
            }
          },
          "receipts.rc_default_receipt_id": "0",
          "receipts.rc_length": "117",
          "receipts.receipt": "12281",
          "receipts.opt_params": {
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003002",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "12727"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003001",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "18828"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003004",
              "receipts.opt_param_len": "1",
              "receipts.vendor_op": "00"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003000",
              "receipts.opt_param_len": "4",
              "receipts.vendor_op": "38218"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003003",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "47718"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x0000001e",
              "receipts.opt_param_len": "9",
              "receipts.receipted_receipt_id": "949BD094"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00000427",
              "receipts.opt_param_len": "1",
              "receipts.receipt_state": "2"
            }
          }
        },
        "receipts": {
          "receipts.command_length": "25",
          "receipts.command_id": "0x80000004",
          "receipts.command_status": "0x00000000",
          "receipts.sequence_number": "35572",
          "receipts.receipt_id": "949C23B8"
        }
      }
    }
  }
]

我试着使用这段代码:

代码语言:javascript
复制
import json
import pandas as pd
from flatten_json import flatten

i_file_name = 'example.json'

with open(i_file_name) as fd:
     json_data = json.load(fd)
json_data = (flatten(d, '.') for d in json_data)

df = pd.DataFrame(json_data)

df.head()

代码语言:javascript
复制
import pandas as pd

i_file_name = 'example.json'

df = pd.read_json(i_file_name)
df = pd.json_normalize(df['_source'])

df.head()

他们给了我同样的结果:只有1行,而不是6行。我试图用record_pathmeta设置json_normalize,但是我想不出怎么做。我对json解析有点陌生,在这里找不到类似的问题。我知道我需要设定正确的钥匙,但我不知道该怎么做

编辑:

毫无疑问,StackOverflow对问题表的支持有限,所以我将尝试解释我的预期输出。

现在,我只看到一行列:

  • _index
  • _type
  • _score
  • _source.layers.frame.*
  • _source.source.receipts.*

其中*表示同一级别下有几个列

*只有5栏:

  • command_length
  • command_id
  • command_status
  • sequence_number
  • receipt_id

我得到的1行包含上一次“收据”-level记录中这些列的值:

代码语言:javascript
复制
 "receipts": {
          "receipts.command_length": "25",
          "receipts.command_id": "0x80000004",
          "receipts.command_status": "0x00000000",
          "receipts.sequence_number": "35572",
          "receipts.receipt_id": "949C23B8"
}

但也有其他“收据”-level记录,如:

代码语言:javascript
复制
"receipts": {
          "receipts.command_length": "238",
          "receipts.command_id": "0x00000005",
          "receipts.sequence_number": "47207",
          "receipts.data_coding": "0x00000000",
          "receipts.data_coding_tree": {
            "receipts.rps": "0x00000000",
            "Receipt Type 1 Data Coding": {
              "receipts.rps.rc_coding_group": "0x00000000",
              "receipts.rps.text_compression": "0",
              "receipts.rps.class_present": "0",
              "receipts.rps.charset": "0x00000000"
            },
            "Receipt Type 2 Data Coding": {
              "receipts.rps.rpk._coding_group": "0x00000000",
              "receipts.rps.rpk._language": "0x00000000"
            }
          },
          "receipts.rc_default_receipt_id": "0",
          "receipts.rc_length": "117",
          "receipts.receipt": "29831",
          "receipts.opt_params": {
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003002",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "47912"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003001",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "98982"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003004",
              "receipts.opt_param_len": "1",
              "receipts.vendor_op": "00"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003000",
              "receipts.opt_param_len": "4",
              "receipts.vendor_op": "23080"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00003003",
              "receipts.opt_param_len": "10",
              "receipts.vendor_op": "29849"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x0000001e",
              "receipts.opt_param_len": "9",
              "receipts.receipted_receipt_id": "949BB6DE"
            },
            "receipts.opt_param": {
              "receipts.opt_param_tag": "0x00000427",
              "receipts.opt_param_len": "1",
              "receipts.receipt_state": "2"
            }
          }
        },

我也想在熊猫的数据栏里看到。所以我现在看到的排应该是第六排。

我有点理解我的json是坏的,因为它有6个不同的键,具有相同的名称(收据),但是也许我可以对它进行不同的解析,这样我就可以正确地将它导入Pandas。

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2020-08-24 09:00:41

我意识到我没有回答我的问题,而是设法解决了这个问题。我为下面的代码道歉,但是如果您想解决这样的问题,可能会有帮助。我已经决定,我宁愿向世界展示我的愚蠢代码,也不愿让它没有任何解决方案。

第一,正如我在质询中所说:

代码语言:javascript
复制
import pandas as pd

i_file_name = 'example.json'

df = pd.read_json(i_file_name)
df = pd.json_normalize(df['_source'])

然后我把它转换成json,然后再把它导入Pandas:

代码语言:javascript
复制
df_json = df.to_json(orient='records')

df = pd.read_json(df_json, orient='columns')

然后我融化了一些图层:

代码语言:javascript
复制
df_melt = pd.melt(df, id_vars=['layers.frame.frame.time',
                          'layers.frame.frame.number'
                       value_vars=['layers.receipts'])

在此之后,我创建了一个新的DataFrame,其中包含了这些熔化的值,并保存了索引,以便以后连接2个数据文件。

代码语言:javascript
复制
df_melt2 = pd.DataFrame(df_melt['value'].values.tolist(), index=df_melt)

然后,我将两个dataframe连接在一起,删除不再需要的列。

代码语言:javascript
复制
df_melt_full = pd.concat([df_melt, df_melt2], axis=1)
df_melt_full = df_melt_full.drop(['value', 'variable'], axis=1)

在那之后,我又融化了它(是的,这是我二月份的代码,我为此感到羞愧)

代码语言:javascript
复制
df_melt_full_melt = pd.melt(df_melt_full, 
                            id_vars=['layers.frame.frame.time',
                                     'layers.frame.frame.number']
                           )

再进口一次

代码语言:javascript
复制
df_normalized = pd.json_normalize(df_melt_full_melt['value'])

最后,我将两个数据文件连接在一起,解决了问题。

代码语言:javascript
复制
df_final = pd.concat([df_melt, df_normalized], axis=1)
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/60031071

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档