Parsing a Yahoo Fantasy Football table with Python BeautifulSoup

Stack Overflow user
Asked on 2015-11-21 01:00:39
Answers: 2 | Views: 812 | Followers: 0 | Votes: 3

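For fun, I'm trying to scrape some data about player transactions from my Yahoo fantasy football league. This is my first time using mechanize and Beautiful Soup, and I'm having trouble printing specific data. What I want to extract is each player's name, whether they were put on waivers, and the date. I can get the first part, but I'm not sure how to get the date. The HTML sample comes first, followed by my code: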

        <table class="Table Table-mid Tst-transaction-table">
                <tr>
        <td class="Grid-u-1-12 Ta-c"><span class="F-icon Block Fz-lg F-positive Cur-h" title="Added Player">&#xe035;</span><span class="F-icon Block Fz-lg F-negative Ptop-med Cur-h" title="Dropped Player">&#xe033;</span></td>
        <td class="Fill-x No-pstart" colspan="2">
            <div class="Pbot-xs">        <a href="http://sports.yahoo.com/nfl/players/24963" target=sports onclick="pop(this)">Dwayne Harris</a>
        <span class="F-position Fz-xxs">NYG - WR</span>
        <a href="http://sports.yahoo.com/nfl/players/24963/news" class="yfa-icon playernote playernote-recent" data-ys-playerid="24963" data-ys-playernote-view="notes" target="_blank" id="playernote-'.24963.'"></a>  <h6 class="F-shade Fz-xxs"> Waiver </h6></div>
                    <div class="Pbot-xs">        <a href="http://sports.yahoo.com/nfl/players/6791" target=sports onclick="pop(this)">Benjamin Watson</a>
        <span class="F-position Fz-xxs">NO - TE</span>
        <a href="http://sports.yahoo.com/nfl/players/6791/news" class="yfa-icon playernote playernote-recent" data-ys-playerid="6791" data-ys-playernote-view="notes" target="_blank" id="playernote-'.6791.'"></a>  <h6 class="F-shade Fz-xxs"> To Waivers</h6></div>
        </td>
        <td class="Ta-end">
            <div class="Grid-h-top Nowrap Fz-xxs">
    <span class="Grid-u">
      <a class="Tst-team-name" href="/f1/313652/10">TeamName2</a> 

      <span class="Block F-timestamp Fz-xxs Nowrap">Nov 20, 4:03 am</span>
    </span>
    <a class='Grid-u' href='/f1/313652/10'><img class="Avatar-sm Mstart-med Grid-u" src="http://l.yimg.com/dh/ap/fantasy/nfl/img/icon_01_100.png" alt="avatar"> </a>
</div>
        </td>
    </tr>    <tr>
        <td class="Grid-u-1-12 Ta-c"><span class="F-icon Block Fz-lg F-positive Cur-h" title="Added Player">&#xe035;</span><span class="F-icon Block Fz-lg F-negative Ptop-med Cur-h" title="Dropped Player">&#xe033;</span></td>
        <td class="Fill-x No-pstart" colspan="2">
            <div class="Pbot-xs">        <a href="http://sports.yahoo.com/nfl/players/7306" target=sports onclick="pop(this)">Darren Sproles</a>
        <span class="F-position Fz-xxs">Phi - RB</span>
        <a href="http://sports.yahoo.com/nfl/players/7306/news" class="yfa-icon playernote playernote-recent" data-ys-playerid="7306" data-ys-playernote-view="notes" target="_blank" id="playernote-'.7306.'"></a>  <h6 class="F-shade Fz-xxs">Free Agent </h6></div>
                    <div class="Pbot-xs">        <a href="http://sports.yahoo.com/nfl/players/24262" target=sports onclick="pop(this)">Joique Bell</a>
        <span class="F-position Fz-xxs">Det - RB</span>
         <span class="F-injury Fz-xxs" title="Probable">P</span>
        <a href="http://sports.yahoo.com/nfl/players/24262/news" class="yfa-icon playernote playernote-old" data-ys-playerid="24262" data-ys-playernote-view="notes" target="_blank" id="playernote-'.24262.'"></a>  <h6 class="F-shade Fz-xxs"> To Waivers</h6></div>
        </td>
        <td class="Ta-end">
            <div class="Grid-h-top Nowrap Fz-xxs">
    <span class="Grid-u">
      <a class="Tst-team-name" href="/f1/313652/3">TeamName1</a> 
      <span class="Block F-timestamp Fz-xxs Nowrap">Nov 19, 1:30 pm</span>
    </span>
    <a class='Grid-u' href='/f1/313652/3'><img class="Avatar-sm Mstart-med Grid-u" src="http://l.yimg.com/dh/ap/fantasy/img/profile_48.png" alt="avatar"> </a>
</div>
        </td>

Code:

import mechanize
from bs4 import BeautifulSoup
import urllib

username = 'my-username'
password = 'my-password'

br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.6')]
br.open("https://football.fantasysports.yahoo.com/f1/313652/transactions")
br.select_form(nr=0)
br.form["username"] = username
br.form["passwd"] = password
response = br.submit()
html_scrape = response.read()
soup = BeautifulSoup(html_scrape, "lxml")

for lines in soup.find_all('div', attrs={'class': 'Pbot-xs'}):
    players = lines.find('a').get_text()
    status = lines.find('h6').get_text()
    if (status == ' To Waivers'):
        print "%s was dropped" % players

I tried using the find() function on the table, but I couldn't work out how to get the text data I'm looking for.

Thanks!


2 Answers

Stack Overflow user

Answered on 2015-11-21 01:08:44

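Without being familiar with the Yahoo fantasy football page, it's hard to give you a definitive answer. That said, if you're trying to point BeautifulSoup at specific divs, you should use BeautifulSoup's select method with a CSS selector, like this: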

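# "Pbot-xs" is a class in the sample HTML, so select it with "." rather than "#".
for lines in soup.select("div.Pbot-xs"):
    players = lines.find('a').text
    # Strip the whitespace around the status text, e.g. " To Waivers".
    status = lines.find('h6').text.strip()
    if status == 'To Waivers':
        print("%s was dropped." % players)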
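
To make the selector syntax concrete, here is a minimal, self-contained sketch run against a trimmed copy of the question's HTML. It assumes the built-in "html.parser" backend instead of lxml so that it has no extra dependency, and the names SAMPLE_HTML, by_class, by_id, and dropped_players are illustrative only. It shows that a "." selector matches the class attribute, a "#" selector matches only an id attribute, and stripping whitespace makes the status comparison robust.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of one transaction cell from the question's HTML.
SAMPLE_HTML = """
<td class="Fill-x No-pstart" colspan="2">
  <div class="Pbot-xs"><a href="http://sports.yahoo.com/nfl/players/24963">Dwayne Harris</a>
    <h6 class="F-shade Fz-xxs"> Waiver </h6></div>
  <div class="Pbot-xs"><a href="http://sports.yahoo.com/nfl/players/6791">Benjamin Watson</a>
    <h6 class="F-shade Fz-xxs"> To Waivers</h6></div>
</td>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "." matches the class attribute; "#" would only match id="Pbot-xs".
by_class = soup.select("div.Pbot-xs")
by_id = soup.select("div#Pbot-xs")

# get_text(strip=True) removes the padding around " To Waivers".
dropped_players = [
    div.find("a").get_text(strip=True)
    for div in by_class
    if div.find("h6").get_text(strip=True) == "To Waivers"
]
print(dropped_players)
```

On this sample the class selector finds both divs, the id selector finds none, and only Benjamin Watson is reported as dropped.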
Votes: 0

Stack Overflow user

Answered on 2015-11-21 03:10:50

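This is a bit tricky because in fantasy sports you can drop a player without necessarily adding one. I handled it by walking through the list and storing player names and dates in order. A player is only recorded if their status is "To Waivers". I then set up try/except blocks to make sure the previous item in the iteration has a corresponding player. This ensures that my dictionary values alternate Player > Date > Player > Date, and so on.

I then iterated over the dictionary and formatted the printed output the way I wanted: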

import mechanize
from bs4 import BeautifulSoup
import urllib

username = 'username@yahoo.com'
password = 'password'

br = mechanize.Browser()
br.addheaders = [('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.6')]
br.open("https://football.fantasysports.yahoo.com/f1/313652/transactions")
br.select_form(nr=0)
br.form["username"] = username
br.form["passwd"] = password
response = br.submit()
html_scrape = response.read()
soup = BeautifulSoup(html_scrape, "lxml")
index = 1
dropped = {}

for players in soup.select("table > tr > td > div"):
    player = players.find('a').get_text()
    try:
        if (players.find('h6').get_text() == ' To Waivers' ):
            dropped[index] = player
    except AttributeError:
        pass

    time = players.find('span',{'class':"Block F-timestamp Fz-xxs Nowrap"})
    if time is not None:
        try:
            nullplayer = dropped[index - 1]
            time = time.get_text()
            dropped[index] = time
        except KeyError:
            pass
    index += 1

count = 1
for items in dropped:
    if (count % 2 == 0):
        player = dropped[items - 1]
        time = dropped[items]
        print("%s dropped on %s" % (player, time))
    count += 1
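
As an alternative to the index-based dictionary, the pairing can also be done one table row at a time, since each transaction row in the question's sample holds both the player divs and a single timestamp span. The following is only a sketch under those assumptions, not the approach used above: it works on a trimmed copy of the sample HTML, uses the built-in "html.parser" backend, and find_dropped_players and SAMPLE_HTML are hypothetical names.

```python
from bs4 import BeautifulSoup

# Trimmed-down copy of the two transaction rows from the question's HTML.
SAMPLE_HTML = """
<table class="Table Table-mid Tst-transaction-table">
  <tr>
    <td class="Fill-x No-pstart" colspan="2">
      <div class="Pbot-xs"><a href="http://sports.yahoo.com/nfl/players/24963">Dwayne Harris</a>
        <h6 class="F-shade Fz-xxs"> Waiver </h6></div>
      <div class="Pbot-xs"><a href="http://sports.yahoo.com/nfl/players/6791">Benjamin Watson</a>
        <h6 class="F-shade Fz-xxs"> To Waivers</h6></div>
    </td>
    <td class="Ta-end">
      <span class="Block F-timestamp Fz-xxs Nowrap">Nov 20, 4:03 am</span>
    </td>
  </tr>
  <tr>
    <td class="Fill-x No-pstart" colspan="2">
      <div class="Pbot-xs"><a href="http://sports.yahoo.com/nfl/players/7306">Darren Sproles</a>
        <h6 class="F-shade Fz-xxs">Free Agent </h6></div>
      <div class="Pbot-xs"><a href="http://sports.yahoo.com/nfl/players/24262">Joique Bell</a>
        <h6 class="F-shade Fz-xxs"> To Waivers</h6></div>
    </td>
    <td class="Ta-end">
      <span class="Block F-timestamp Fz-xxs Nowrap">Nov 19, 1:30 pm</span>
    </td>
  </tr>
</table>
"""


def find_dropped_players(html):
    """Return (player, timestamp) pairs for players marked "To Waivers"."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.select("table.Tst-transaction-table tr"):
        # Each transaction row is assumed to carry exactly one timestamp.
        stamp = row.select_one("span.F-timestamp")
        if stamp is None:
            continue
        when = stamp.get_text(strip=True)
        for div in row.select("div.Pbot-xs"):
            link = div.find("a")
            status = div.find("h6")
            if link is None or status is None:
                continue
            if status.get_text(strip=True) == "To Waivers":
                results.append((link.get_text(strip=True), when))
    return results


for player, when in find_dropped_players(SAMPLE_HTML):
    print("%s dropped on %s" % (player, when))
```

Because the player and the timestamp are read from the same row, no bookkeeping of indexes or dictionary ordering is needed. The real page may contain rows with a different layout, which this sketch simply skips when no timestamp is found.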
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/33832138
