Web scraping with bs4
Stack Overflow user
Asked 2016-09-03 02:01:04
Answers: 1 · Views: 79 · Followers: 0 · Votes: 0
import requests
from bs4 import BeautifulSoup

url = "http://bet.hkjc.com/football/index.aspx?lang=en"
r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

div = soup.find("div", {"class": "footballmaincontent"})
tables = div.find_all("table")
my_table = tables[2]

for row in my_table.find_all('tr'):
    cols = row.find_all('td')

    odds_list = []
    if len(cols) >= 10:
        match_no = (cols[0].text.strip())
        teams = (cols[2].text.strip())
        match_time = (cols[4].text.strip())
        home_odds = (cols[7].text.strip())
        away_odds = (cols[8].text.strip())
        draw_odds = (cols[9].text.strip())

        odds_row = (match_no,teams,match_time,home_odds,away_odds,draw_odds)
        odds_list.append(odds_row)

# Write to csv file
import csv
with open("odds_file.csv", "wb") as file:
    writer = csv.writer(file)
    for row in odds_list:
        writer.writerow(row)

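I tried to export the columns to a csv file by appending them to "odds_list" inside the for loop, but nothing ends up being written to "odds_file".

I know something is wrong with this line: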

odds_row = (match_no,teams,match_time,home_odds,away_odds,draw_odds)

But how do I get the list I created into the csv file?

1 Answer

Stack Overflow user

Answered 2016-09-03 02:27:19

You already have my_table, so call find / find_all on my_table to get each <tr>, then each <td> inside it, and then read the text from each <td>.

EDIT:

import requests
from bs4 import BeautifulSoup

url = "http://bet.hkjc.com/football/index.aspx?lang=en"
r = requests.get(url)

soup = BeautifulSoup(r.content, "html.parser")

div = soup.find("div", {"class": "footballmaincontent"})
tables = div.find_all("table")
my_table = tables[2]

for row in my_table.find_all('tr'):
    cols = row.find_all('td')
    if len(cols) >= 10:
        print(cols[0].text.strip(),'|',end='')
        print(cols[2].text.strip(),'|',end='')
        print(cols[4].text.strip(),'|',end='')
        print(cols[7].text.strip(),'|',end='')
        print(cols[8].text.strip(),'|',end='')
        print(cols[9].text.strip(),'|',end='')
        print()
        print('-'*40)

Result

Match No. |Teams(Home vs Away) |Expected StopSelling Time |Home/Away/Draw | | |
----------------------------------------
FRI 9 |Romania U21 vs Luxembourg U21 |03/09 01:30 |Accept In Play Betting Only | | |
----------------------------------------
FRI 13 |St. Vincent and Grenadines vs USA |03/09 03:30 |35.00 |13.00 |1.02 |
----------------------------------------
FRI 14 |Honduras vs Canada |03/09 05:06 |1.45 |3.55 |6.50 |
----------------------------------------
FRI 15 |Trinidad and Tobago vs Guatemala |03/09 07:00 |1.67 |3.20 |4.70 |
----------------------------------------
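
The code above prints the cells instead of writing them, so here is a minimal sketch of the CSV step the question asks about. In the question's code, odds_list = [] sits inside the for loop, so the list is emptied on every row and holds at most the last row when the file is written; if that last row has fewer than 10 cells, the file ends up empty. The sketch creates the list once, before the loop. It also assumes Python 3 (as the print calls above do), where the csv module needs a text-mode file opened with newline="" rather than "wb". The names WANTED, collect_odds and write_odds are hypothetical, and the functions take plain lists of cell text so they can be run without fetching the page.

```python
import csv

# Column positions copied from the question's code:
# match no., teams, stop-selling time, and the three odds columns.
WANTED = (0, 2, 4, 7, 8, 9)


def collect_odds(rows):
    """Build the list of odds tuples.

    rows is an iterable of lists of cell text, one list per <tr>.
    """
    odds_list = []  # created once, before the loop, so rows accumulate
    for cols in rows:
        if len(cols) >= 10:  # same filter as in the question
            odds_list.append(tuple(cols[i].strip() for i in WANTED))
    return odds_list


def write_odds(odds_list, path="odds_file.csv"):
    """Write the tuples to a CSV file (Python 3 text mode)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(odds_list)
```

With the my_table from the code above, the input could be built as rows = [[td.text for td in tr.find_all('td')] for tr in my_table.find_all('tr')], followed by write_odds(collect_odds(rows)).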
Votes: -1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/39298340
