
Question: Getting values with BS4

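Stack Overflow user

Asked 2018-10-08 16:09:35

1 answer, 168 views, 0 followers, score 1

I'm trying to get the "data-val" values from my soup, but they all come out in one giant list instead of being split into separate lists/columns the way they are displayed on the website.

I know the headers are here: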

<th class="num record drop-3" data-tsorter="data-val">
    <span class="long-points">
     proj. pts.
    </span>
    <span class="short-points">
     pts.
    </span>
   </th>
   <th class="pct" data-tsorter="data-val">
    <span class="full-relegated">
     relegated
    </span>
    <span class="small-relegated">
     rel.
    </span>
   </th>
   <th class="pct" data-tsorter="data-val">
    <span class="full-champ">
     qualify for UCL
    </span>
    <span class="small-champ">
     make UCL
    </span>
   </th>
   <th class="pct sorted" data-tsorter="data-val">
    <span class="drop-1">
     win Premier League
    </span>
    <span class="small-league">
     win league
    </span>
   </th>

This is what I have so far:

import requests
from bs4 import BeautifulSoup

url = 'https://projects.fivethirtyeight.com/soccer-predictions/premier-league/'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
table = soup.find("table", {"class": "forecast-table"})
# print(table.prettify())
# Prints every "pct" cell in the table, one after another
for cell in table.find_all("td", {"class": "pct"}):
    print(cell)

So ideally I'd like 4 lists, each containing the class name followed by the matching values.

python

beautifulsoup

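1 Answer

Stack Overflow user

Accepted answer

Answered 2018-10-08 16:58:10

Not entirely sure which specific cols you want, but this will get all the cols that have data-val among their tag attributes: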

import requests
from bs4 import BeautifulSoup

url = 'https://projects.fivethirtyeight.com/soccer-predictions/premier-league/'
r = requests.get(url)

soup = BeautifulSoup(r.text, "html.parser")
table = soup.find("table", {"class": "forecast-table"})

# Each team is one <tr class="team-row"> in the forecast table
team_rows = table.find_all("tr", {"class": "team-row"})

for team in team_rows:
    print("Team name: {}".format(team['data-str']))

    team_data = team.find_all("td")

    for data in team_data:
        # Only cells that carry a data-val attribute hold a sortable value
        if data.has_attr('data-val'):
            print("\t{}".format(data['data-val']))
    print("\n")

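If I've understood your question correctly, you're looking for the last few values, which are fairly unlabelled in the HTML source. In that case you could try simply looking up tag[6], though of course that isn't very robust. But this is HTML parsing, so "not very robust" is par for the course, IMHO.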
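The positional lookup described above could be sketched roughly like this. The HTML snippet is a made-up, simplified stand-in for a single team row (the real page's markup and column count may differ), and `cell_value` is a hypothetical helper name, not something bs4 provides. It returns None instead of raising when the index is out of range or the cell has no data-val.

```python
from bs4 import BeautifulSoup

# Assumption: simplified stand-in for one team row; the live page may differ.
SAMPLE_ROW = """
<table><tr class="team-row" data-str="Example FC">
  <td class="team">Example FC</td>
  <td class="num" data-val="80.5">81</td>
  <td class="pct" data-val="0.01">1%</td>
  <td class="pct" data-val="0.75">75%</td>
  <td class="pct" data-val="0.20">20%</td>
</tr></table>
"""


def cell_value(row, index, attr="data-val"):
    """Return `attr` of the td at position `index`, or None if unavailable."""
    cells = row.find_all("td")
    # Guard against positions that do not exist in this row
    if index >= len(cells) or index < -len(cells):
        return None
    # Tag.get returns None when the attribute is absent
    return cells[index].get(attr)


row = BeautifulSoup(SAMPLE_ROW, "html.parser").find("tr", {"class": "team-row"})
last_value = cell_value(row, -1)   # value of the last cell in the row
missing = cell_value(row, 6)       # this sample row has no seventh cell
```

Counting from the end (for example -1) may hold up slightly better than a fixed forward index if columns are added on the left, but either way it depends on the page layout staying the same.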

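What I'm doing here is finding all the team rows (which is easy because of the class name) and then simply looping over all the td tags within each team row.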
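To get the separate lists the question asks for, the same row loop could collect values per column instead of printing them. This is only a sketch under assumptions: the HTML is an invented two-column sample, `columns_by_header` is a hypothetical helper, and it assumes the th cells marked with data-tsorter="data-val" appear in the same order as the td cells carrying data-val in each row. The header classes in the question are not unique ("pct" appears three times), so the sketch keys each list by the text of the header's first span rather than by class name.

```python
from bs4 import BeautifulSoup

# Assumption: invented sample with two value columns; the real table has more.
SAMPLE = """
<table class="forecast-table">
  <tr>
    <th class="num" data-tsorter="data-val"><span>proj. pts.</span><span>pts.</span></th>
    <th class="pct" data-tsorter="data-val"><span>relegated</span><span>rel.</span></th>
  </tr>
  <tr class="team-row" data-str="Example FC">
    <td>Example FC</td><td class="num" data-val="80.5">81</td><td class="pct" data-val="0.01">1%</td>
  </tr>
  <tr class="team-row" data-str="Sample United">
    <td>Sample United</td><td class="num" data-val="40.2">40</td><td class="pct" data-val="0.35">35%</td>
  </tr>
</table>
"""


def columns_by_header(table):
    """Map each header label to the list of data-val values in that column."""
    # Use the first span of each sortable header as the column label
    headers = [
        th.find("span").get_text(strip=True)
        for th in table.find_all("th", attrs={"data-tsorter": "data-val"})
    ]
    columns = {name: [] for name in headers}
    for row in table.find_all("tr", {"class": "team-row"}):
        # Keep only the cells that actually carry a data-val attribute
        values = [td["data-val"] for td in row.find_all("td", attrs={"data-val": True})]
        # Pair values with headers by position (assumed to line up)
        for name, value in zip(headers, values):
            columns[name].append(value)
    return columns


soup = BeautifulSoup(SAMPLE, "html.parser")
columns = columns_by_header(soup.find("table", {"class": "forecast-table"}))
```

The values come back as strings; converting them with float() would be a natural next step if they are needed for arithmetic.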

Score 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/52706259
