
KeyError: "None of ['index'] are in the columns"

Stack Overflow user
Asked 2021-05-26 13:56:06
1 answer, 540 views, 0 followers, 0 votes

Here is a JSON file:

{
    "id": "68af48116a252820a1e103727003d1087cb21a32",
    "article": [
        "by mark duell .",
        "published : .",
        "05:58 est , 10 september 2012 .",
        "| .",
        "updated : .",
        "07:38 est , 10 september 2012 .",
        "a pet owner starved her two dogs so badly that one was forced to eat part of his mother 's dead body in a desperate attempt to survive .",
        "the mother died a ` horrendous ' death and both were in a terrible state when found after two weeks of starvation earlier this year at the home of katrina plumridge , 31 , in grimsby , lincolnshire .",
        "the barely-alive dog was ` shockingly thin ' and the house had a ` nauseating and overpowering ' stench , grimsby magistrates court heard .",
        "warning : graphic content .",
        "horrendous : the male dog , scrappy -lrb- right -rrb- , was so badly emaciated that he ate the body of his mother ronnie -lrb- centre -rrb- to try to survive at the home of katrina plumridge in grimsby , lincolnshire .",
        "the suffering was so serious that the female staffordshire bull terrier , named ronnie , died of starvation , nigel burn , prosecuting , told the court last friday .",
        "suspended jail term : the dogs were in a terrible state when found after two weeks of starvation at the home of katrina plumridge , 31 -lrb- pictured -rrb- .",
        "the male dog , her son scrappy , was so badly emaciated that he ate her body to try to survive ."
    ],
    "abstract": [
        "neglect by katrina plumridge saw staffordshire bull terrier ronnie die .",
        "dog 's son scrappy was forced to eat her to survive at grimsby house .",
        "alarm raised by letting agent shocked by ` thinnest dog he 'd ever seen '",
    ]
}

I ran df = pd.read_json('100252.json'), but I got the error: ValueError: arrays must all be same length

Then I tried:

with open('100252.json') as json_data: 
    data = json.load(json_data) 

pd.DataFrame.from_dict(data, orient='index').T.set_index('index')

But I got the error KeyError: "None of ['index'] are in the columns"

How can I fix this? I don't know where the error comes from, so I need your help.

Edit

Source: https://huggingface.co/docs/datasets/loading_datasets.html

I want to do something similar to this example from that page:

>>> from datasets import Dataset
>>> import pandas as pd
>>> df = pd.DataFrame({"a": [1, 2, 3]})
>>> dataset = Dataset.from_pandas(df)

I have to convert the JSON file to a DataFrame, then use the datasets library to build a Dataset from pandas.


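1 Answer

Stack Overflow user

Answered 2021-05-26 15:29:11

The input to Dataset must be a dict whose values are lists of the same length. So you can either:

  1. Join the sentences into a single string and wrap it in a one-element list.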

import json

from datasets import Dataset

with open('100252.json') as json_data:
    data = json.load(json_data)

# Wrap each value in a one-element list so every column has length 1.
data['id'] = [data['id']]
data['article'] = ["\n".join(data['article'])]
data['abstract'] = ["\n".join(data['abstract'])]

Dataset.from_dict(data)

Your dataset will contain one row.
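Since the question goes through pandas, the same one-row idea can be sketched as a DataFrame. This is only an illustration under assumptions: the `data` dict below is a shortened, hypothetical stand-in for the parsed JSON file, and the `Dataset.from_pandas` call is left as a comment so the snippet runs with pandas alone. It also shows where the KeyError comes from: the columns are `id`, `article` and `abstract`, so there is no column called `index` to pass to `set_index`.

```python
import pandas as pd

# Hypothetical, shortened stand-in for the result of json.load on the file.
data = {
    "id": "68af48116a252820a1e103727003d1087cb21a32",
    "article": ["by mark duell .", "published : ."],
    "abstract": ["neglect by katrina plumridge saw staffordshire bull terrier ronnie die ."],
}

# Wrap every value in a one-element list so all columns have length 1.
row = {
    "id": [data["id"]],
    "article": ["\n".join(data["article"])],
    "abstract": ["\n".join(data["abstract"])],
}
df = pd.DataFrame(row)

# There is no 'index' column here, which is why set_index('index') raises
# a KeyError. If an index is wanted, use a column that exists, such as 'id'.
df_by_id = df.set_index("id")

# With the datasets library installed, df could then be passed on:
# from datasets import Dataset
# dataset = Dataset.from_pandas(df)
```

Whether an index is needed at all depends on what is done with the DataFrame afterwards; for a hand-off to `Dataset.from_pandas`, the plain `df` is probably enough.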

  2. Align the lists, for example by padding the shorter one with empty strings.

# Length of the longest list; every column is padded up to this length.
max_len = max(len(data[col]) for col in ['article', 'abstract'])

# Repeat the scalar id on every row and pad the shorter list with "".
data['id'] = [data['id']] * max_len
data['article'] = data['article'] + [""] * (max_len - len(data['article']))
data['abstract'] = data['abstract'] + [""] * (max_len - len(data['abstract']))
Dataset.from_dict(data)
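The padding approach can also be written as a small helper that does not modify the parsed JSON in place. This is a sketch, not part of the original answer: `pad_columns` and the sample `record` are hypothetical names introduced for illustration, and the helper assumes that every value is either a scalar or a list.

```python
def pad_columns(record, fill=""):
    """Return a copy of record whose values are all lists of equal length."""
    # Longest list among the values; fall back to 1 if there are no lists.
    list_lengths = [len(v) for v in record.values() if isinstance(v, list)]
    max_len = max(list_lengths, default=1)

    padded = {}
    for key, value in record.items():
        if isinstance(value, list):
            # Pad shorter lists on the right with the fill value.
            padded[key] = value + [fill] * (max_len - len(value))
        else:
            # Repeat scalars (such as the id) once per row.
            padded[key] = [value] * max_len
    return padded


# Hypothetical sample with the same shape as the JSON in the question.
record = {"id": "x1", "article": ["a", "b", "c"], "abstract": ["s"]}
padded = pad_columns(record)
```

The returned dict has equal-length lists as values, so it should be usable wherever the padded `data` dict is used, while the original `record` stays unchanged.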
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/67706435
