
Web scraping with Beautiful Soup (sports data)

Stack Overflow user
Asked 2019-04-16 19:06:21
Answers: 3 · Views: 319 · Followers: 0 · Votes: 0

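When I try to run this code, I get two errors. 1: The first problem is that I cannot scrape the data for name_text correctly.

2: I get an indentation error on team = name_text.div.text. I know this is probably easy to fix, but I have tried different indentation and none of it seems to work.

On the site, I want to scrape the team names and the odds.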

<div class="size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r" data-automation-id="participant-one">Orlando Magic</div>
<div class="priceText_f71sibe"><span class="size14_f7opyze medium_f1wf24vo priceTextSize_frw9zm9" data-automation-id="price-text">5.85</span></div>

The HTML above was copied from the site.

from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

price_text = soup.findAll("div",{"class":"priceText_f71sibe"})
name_text = soup.findAll("div",{"class":"size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r"})
filename = "odds.csv"
f = open(filename,"w")
headers = "Team, odds_team\n"
print(name_text)
f.write(headers)

for price_text in price_texts:
team = name_text.div.text
odds = price_text.span.text

print(odds)
print(team + odds)
f.write(team + "," + odds + "\n")
f.close()

Any help would be great. Cheers.

3 Answers

Stack Overflow user

Answered 2019-04-16 19:14:32

Your for loop is indented incorrectly. The correct indentation is:

for price_text in price_texts:
    team = name_text.div.text
    odds = price_text.span.text

    print(odds)
    print(team + odds)
    f.write(team + "," + odds + "\n")
f.close()

There are 4 spaces before team and odds. Please read the Python for-loop documentation.
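The indentation rule can be checked without touching the website. The sketch below is only an illustration: `broken`, `fixed`, and `compiles` are made-up names, and the loop bodies are stand-ins for the real scraping code. It asks Python to compile a loop whose body is not indented and one whose body is indented by 4 spaces.

```python
# Two tiny programs as strings: the first has an unindented loop body,
# the second indents the body by 4 spaces.
broken = "for x in [1, 2]:\nprint(x)\n"
fixed = "for x in [1, 2]:\n    print(x)\n"


def compiles(source):
    """Return True if Python accepts the source, False on an indentation error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False


print(compiles(broken))
print(compiles(fixed))
```

Python rejects the first program before running any of it, which is why the error shows up on the first line of the loop body.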

Also, there is no price_texts variable. You need to assign it when you call findAll; you forgot an 's':

price_texts = soup.findAll("div",{"class":"priceText_f71sibe"})

Finally, consider using with instead of open()/close() to write your file.
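A minimal sketch of the with suggestion, assuming the scraped values have already been collected into a list of (team, odds) tuples. The `write_odds` helper and the sample row are hypothetical, and the example writes into a temporary directory so it does not overwrite a real odds.csv.

```python
import csv
import os
import tempfile


def write_odds(path, rows):
    """Write a header plus (team, odds) rows to a CSV file."""
    # newline="" lets the csv module handle line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Team", "odds_team"])
        writer.writerows(rows)
    # Leaving the with block closes the file, even if an exception was raised


# Hypothetical sample row taken from the HTML in the question
sample_rows = [("Orlando Magic", "5.85")]
out_path = os.path.join(tempfile.mkdtemp(), "odds.csv")
write_odds(out_path, sample_rows)
```

Using csv.writer rather than joining strings by hand also takes care of quoting if a team name ever contains a comma.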

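Votes: 1

Stack Overflow user

Answered 2019-04-16 19:21:33

What I had in mind is that you iterate over the results, store them in lists, and then write them to the file. Unfortunately I can't access the site from work, so I can't test the code, but I believe this should give the output you are looking for: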

from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
import csv
from itertools import zip_longest

my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

price_text = soup.findAll("span",{"data-automation-id":"price-text"})
name_text = soup.findAll("div",{"data-automation-id":"participant-one"})

team_list = [ name.text.strip() for name in name_text ]
odds_list = [ price.text.strip() for price in price_text ]

d = [team_list, odds_list]
export_data = zip_longest(*d, fillvalue = '')
# The with block closes the file on exit, so no explicit close() is needed
with open('odds.csv', 'w', encoding="ISO-8859-1", newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(("Team", "odds_team"))
    wr.writerows(export_data)
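The choice of zip_longest matters when the two lists end up with different lengths. The sketch below uses made-up lists (`teams` and `odds` are hypothetical stand-ins for the scraped values, and "Team B" is not from the page) to compare it with the built-in zip.

```python
from itertools import zip_longest

# Hypothetical scraped values: one more team than odds
teams = ["Orlando Magic", "Team B"]
odds = ["5.85"]

# zip stops at the shorter list, so the unmatched team is silently dropped
strict_rows = list(zip(teams, odds))

# zip_longest keeps every item and pads the missing side with fillvalue
padded_rows = list(zip_longest(teams, odds, fillvalue=""))

print(strict_rows)
print(padded_rows)
```

A padded row with an empty odds column is a useful hint that the two selectors did not match the same number of elements.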

Votes: 0

Stack Overflow user

Answered 2019-04-16 19:25:04

Can you try this?

from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

price_texts = soup.findAll("div",{"class":"priceText_f71sibe"})
name_texts = soup.findAll("div",{"class":"size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r"})
filename = "odds.csv"
f = open(filename,"w")
headers = "Team, odds_team\n"
print(name_texts)
f.write(headers)

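# Walk both result lists together so every team/odds pair gets its own row
# (assumes the two lists come back in the same order and with the same length)
for name_text, price_text in zip(name_texts, price_texts):
    team = name_text.text
    odds = price_text.text
    print(team + odds)
    f.write(team + "," + odds + "\n")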
f.close()

Votes: 0

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/55706731
