Q: Web scraping with Beautiful Soup (sports data)

Stack Overflow user

Asked 2019-04-16 19:06:21

3 answers, 319 views, 0 followers, 0 votes

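When I try to run this code, I get two errors. 1: I can't scrape the data for name_text correctly.

2: I get an indentation error on team = name_text.div.text. I know this is probably easy to fix, but I have tried different indentation and nothing seems to work.

I want to scrape the team names and the odds from the website.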

<div class="size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r" data-automation-id="participant-one">Orlando Magic</div>
<div class="priceText_f71sibe"><span class="size14_f7opyze medium_f1wf24vo priceTextSize_frw9zm9" data-automation-id="price-text">5.85</span></div>

The HTML above was copied from the website.

from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

price_text = soup.findAll("div",{"class":"priceText_f71sibe"})
name_text = soup.findAll("div",{"class":"size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r"})
filename = "odds.csv"
f = open(filename,"w")
headers = "Team, odds_team\n"
print(name_text)
f.write(headers)

for price_text in price_texts:
team = name_text.div.text
odds = price_text.span.text

print(odds)
print(team + odds)
f.write(team + "," + odds + "\n")
f.close()

Any help would be great. Cheers.

Tags: beautifulsoup, python, web-scraping

3 Answers

Stack Overflow user

Answered 2019-04-16 19:14:32

Your for loop is indented incorrectly. The correct indentation is:

for price_text in price_texts:
    team = name_text.div.text
    odds = price_text.span.text

    print(odds)
    print(team + odds)
    f.write(team + "," + odds + "\n")
f.close()

The team and odds lines each need 4 spaces of indentation. Please read the Python for loop documentation.

Also, there is no price_texts variable. You need to assign the result of findAll to that name; you left off the 's':

price_texts = soup.findAll("div",{"class":"priceText_f71sibe"})
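The first problem in the question (reading name_text) is likely related: findAll returns a list-like ResultSet, so attributes such as .div or .text have to be read from each element, not from the result as a whole. The team div in the quoted HTML also has no nested div, so .div on it would be None. A minimal sketch, assuming the page markup matches the snippet quoted in the question (the live page may differ) and that names and prices appear in matching order:

```python
from bs4 import BeautifulSoup

# Markup copied from the question, used here in place of the live page.
html = """
<div class="size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r" data-automation-id="participant-one">Orlando Magic</div>
<div class="priceText_f71sibe"><span class="size14_f7opyze medium_f1wf24vo priceTextSize_frw9zm9" data-automation-id="price-text">5.85</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches any element that carries this class among its classes,
# so the full class string does not have to be repeated.
name_texts = soup.find_all("div", class_="participantText_fivg86r")
price_texts = soup.find_all("div", class_="priceText_f71sibe")

rows = []
# Assumption: the two result lists line up one-to-one.
for name_text, price_text in zip(name_texts, price_texts):
    team = name_text.text.strip()        # the team div holds the text directly
    odds = price_text.span.text.strip()  # the price sits in a nested span
    rows.append((team, odds))

print(rows)
```

Matching on a single class name is a design choice here; the generated-looking class names in the question may change between site builds, in which case the data-automation-id attributes could be a steadier hook.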

Finally, consider using a with statement instead of calling open() and .close() yourself when writing the file.
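As a rough illustration of the with suggestion, the sketch below writes the CSV through csv.writer inside a with block. The rows list is a stand-in for scraped (team, odds) pairs, and the temporary directory is only there so the example can run anywhere; neither is part of the original code.

```python
import csv
import os
import tempfile

# Stand-in for scraped (team, odds) pairs; values taken from the question's HTML.
rows = [("Orlando Magic", "5.85")]

# Hypothetical output location so the sketch does not touch the working directory.
path = os.path.join(tempfile.mkdtemp(), "odds.csv")

# The with block closes the file automatically, even if writing raises an error.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Team", "odds_team"])  # header row
    writer.writerows(rows)                  # one CSV row per (team, odds) pair

# Read the file back to check what was written.
with open(path, newline="", encoding="utf-8") as f_in:
    written = list(csv.reader(f_in))

print(written)
```

Using csv.writer instead of joining strings with "," also takes care of quoting if a team name ever contains a comma.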

Votes: 1

Stack Overflow user

Answered 2019-04-16 19:21:33

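One thing you could do is iterate over the results, store them in lists, and then write the lists to the file. Unfortunately I can't access the site from work, so I can't test the code, but I believe this should give the output you're looking for: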

from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
import csv
from itertools import zip_longest

my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

price_text = soup.findAll("span",{"data-automation-id":"price-text"})
name_text = soup.findAll("div",{"data-automation-id":"participant-one"})

team_list = [ name.text.strip() for name in name_text ]
odds_list = [ price.text.strip() for price in price_text ]

d = [team_list, odds_list]
export_data = zip_longest(*d, fillvalue = '')
# The with block closes the file on exit, so no explicit close() is needed
with open('odds.csv', 'w', encoding="ISO-8859-1", newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(("Team", "odds_team"))
    wr.writerows(export_data)
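To make the role of zip_longest in the code above a little more concrete, here is a small sketch of how it differs from plain zip when the two lists have different lengths. The team names and the price are made-up placeholders, not data from the site.

```python
from itertools import zip_longest

# Placeholder lists of unequal length (illustrative values only).
team_list = ["Team A", "Team B"]
odds_list = ["1.50"]

# zip stops at the shorter list, silently dropping "Team B".
paired_short = list(zip(team_list, odds_list))

# zip_longest keeps every item and pads the missing side with fillvalue.
paired_full = list(zip_longest(team_list, odds_list, fillvalue=""))

print(paired_short)
print(paired_full)
```

Padding keeps every scraped value in the CSV, but an empty cell is also a hint that the two lists did not line up, which is worth checking before trusting the pairs.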

Votes: 0

Stack Overflow user

Answered 2019-04-16 19:25:04

Can you try this?

from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

price_texts = soup.findAll("div",{"class":"priceText_f71sibe"})
name_texts = soup.findAll("div",{"class":"size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r"})
filename = "odds.csv"
f = open(filename,"w")
headers = "Team, odds_team\n"
print(name_texts)
f.write(headers)

# Pair each team with its odds and write one row per pair
# (assumes both lists are in matching order)
for name_text, price_text in zip(name_texts, price_texts):
    team = name_text.text
    odds = price_text.text
    print(odds)
    print(team + odds)
    f.write(team + "," + odds + "\n")
f.close()

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/55706731
