文章/答案/技术大牛

发布

社区首页 >问答首页 >如何将列表中的文件列表保存为python中的json文件？

问如何将列表中的文件列表保存为python中的json文件？
EN

Stack Overflow用户

提问于 2022-11-10 13:52:43

回答 2查看 33关注 0票数 1

我试图用python中的漂亮so解析网站上的数据，最后我从网站中提取数据，所以我想将数据保存在json文件中，但是它按照我编写的代码保存数据如下。

json文件

[
    {
        "collocation": "\nabove average",
        "meaning": "more than average, esp. in amount, age, height, weight etc. "
    },
    {
        "collocation": "\nabsolutely necessary",
        "meaning": "totally or completely necessary"
    },
    {
        "collocation": "\nabuse drugs",
        "meaning": "to use drugs in a way that's harmful to yourself or others"
    },
    {
        "collocation": "\nabuse of power",
        "meaning": "the harmful or unethical use of power"
    },
    {
        "collocation": "\naccept (a) defeat",
        "meaning": "to accept the fact that you didn't win a game, match, contest, election, etc."
    },

我的代码：

import requests
from bs4 import BeautifulSoup
from selenium import webdriver
import pandas as pd
import json


url = "https://www.englishclub.com/ref/Collocations/"

mylist = [
        "A",
        "B",
        "C",
        "D",
        "E",
        "F",
        "G",
        "H",
        "I",
        "J",
        "K",
        "L",
        "M",
        "N",
        "O",
        "P",
        "Q",
        "R",
        "S",
        "T",
        "U",
        "V",
        "W"
]


list = []


for i in range(23):
    result = requests.get(url+mylist[i]+"/", headers=headers)
    doc = BeautifulSoup(result.text, "html.parser")
    collocations = doc.find_all(class_="linklisting")

    for tag in collocations:
            case = {
                    "collocation": tag.a.string,
                    "meaning": tag.div.string
            }
            list.append(case)


with open('data.json', 'w', encoding='utf-8') as f:

    json.dump(list, f, ensure_ascii=False, indent=4)

但是，例如，我想为每个字母有一个列表，例如，A的一个列表和B的多一个列表，这样我就可以很容易地找到哪个字母开头并使用它。我怎么能这么做。正如您在json文件中所看到的，在配置的开头总是有\，我如何删除它？

json

beautifulsoup

python

回答 2

Stack Overflow用户

回答已采纳

发布于 2022-11-10 14:08:16

import requests
from bs4 import BeautifulSoup
import pandas as pd
import json


url = "https://www.englishclub.com/ref/Collocations/"

mylist = [
        "A",
        "B",
        "C",
        "D",
        "E",
        "F",
        "G",
        "H",
        "I",
        "J",
        "K",
        "L",
        "M",
        "N",
        "O",
        "P",
        "Q",
        "R",
        "S",
        "T",
        "U",
        "V",
        "W"
]

#you can use dictionary instead list. suits your needs better
list = {}

#just for quick testing, i set range to 4
for i in range(4):
    list[mylist[i]] = [] #make an empty list for your collocations

    result = requests.get(url+mylist[i]+"/")
    doc = BeautifulSoup(result.text, "html.parser")
    collocations = doc.find_all(class_="linklisting")

    for tag in collocations:
            
            case = {
                    "collocation": tag.a.string.replace("\n",""),#replace \n indentations
                    "meaning": tag.div.string
            }
            list[mylist[i]].append(case)#add collocation to related list


with open('data.json', 'w', encoding='utf-8') as f:

    json.dump(list, f, ensure_ascii=False, indent=4)

我已经为更改的部分写了一篇评论。我们为字典中的每一个字母创建了一个数组。因此，在将来的使用中，您只需使用键就可以获得它们，而不必担心索引。

但是，这是输出

{
    "A": [
        {
            "collocation": "above average",
            "meaning": "more than average, esp. in amount, age, height, weight etc. "
        },
        {
            "collocation": "absolutely necessary",
            "meaning": "totally or completely necessary"
        }
    ],
    "B": [
        {
            "collocation": "back pay",
            "meaning": "money a worker earned in the past but hasn't been paid yet  "
        },
        {
            "collocation": "back road",
            "meaning": "a small country road "
        },
        {
            "collocation": "back street",
            "meaning": "a street in a town or city that's away from major roads or central areas"
        }
    ],
    "C": [
        {
            "collocation": "call a meeting",
            "meaning": "to order or invite people to hold a meeting"
        },
        {
            "collocation": "call a name",
            "meaning": "to say somebody's name loudly"
        },
        {
            "collocation": "call a strike",
            "meaning": "to decide that workers will protest by not going to work "
        }
    ],
    "D": [
        {
            "collocation": "daily life",
            "meaning": "life as experienced from day to day"
        },
        {
            "collocation": "dead ahead",
            "meaning": "straight ahead"
        },
        {
            "collocation": "dead body",
            "meaning": "corpse, or the body of someone who's died"
        }
    ]
}

票数 1

Stack Overflow用户

发布于 2022-11-10 14:08:52

在您的循环中，在定义doc之后，尝试以下操作：

for col in doc.select('div.linklisting'):
    print(print(col.select_one('h3 a').text.strip(), "--", col.select_one('div.linkdescription').text))

例如，对于字母B，它应该输出：

back pay -- money a worker earned in the past but hasn't been paid yet  
back road -- a small country road 
back street -- a street in a town or city that's away from major roads or central areas

等等，您可以将输出元素分配给CSV、dataframe或其他任何东西。

票数 0

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/74390165

复制

相似问题

问如何将列表中的文件列表保存为python中的json文件？
EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何将列表中的文件列表保存为python中的json文件？EN

回答 2

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何将列表中的文件列表保存为python中的json文件？
EN