
Trying to scrape sports data for a player from this website using Python

Stack Overflow user
Asked 2020-09-03 00:47:06

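Here is the URL of the web page I'm working with: https://www.playerprofiler.com/nfl/christian-mccaffrey/

The data I'm looking for is stored in the following HTML tag: <div class='c-metric-table'>

I used the following Python code to try to scrape the data: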

import requests
from bs4 import BeautifulSoup as soup

my_test_url = 'https://www.playerprofiler.com/nfl/christian-mccaffrey/'

# Fetch the page and parse the HTML the server returns
res = requests.get(my_test_url)
page_soup = soup(res.content, 'html.parser')

Whenever I run:

len(page_soup.find_all('div', class_='c-metric-table'))

the terminal returns 0 when it should return 5 or 10 (I'm not quite sure which).

I don't know what I'm doing wrong. Please help!
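One way to narrow down a result of 0 is to check whether the markup is present in the raw response at all. The sketch below uses only the standard library and two hand-written HTML snippets (both are assumptions for illustration, not the real page): one where the table divs are absent from the server response, presumably because a script adds them later, and one where they are already present. `count_tags_with_class` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name that carry a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may hold several space-separated class names
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_tags_with_class(html, tag, css_class):
    """Return how many <tag> elements in html have css_class."""
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.count


# Hand-written stand-in for a response whose content is filled in by a script
raw_html = """
<html><body>
  <div id="app"></div>
  <script src="/assets/app.js"></script>
</body></html>
"""

# Hand-written stand-in for the same page after the script has run
rendered_html = """
<html><body>
  <div class='c-metric-table'>...</div>
  <div class="c-metric-table is-open">...</div>
</body></html>
"""

print(count_tags_with_class(raw_html, "div", "c-metric-table"))       # 0
print(count_tags_with_class(rendered_html, "div", "c-metric-table"))  # 2
```

If the same count on `res.text` from the request above is also 0, the parser is not at fault: the data is simply not in the HTML that `requests` receives.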


1 Answer

Stack Overflow user

Accepted answer

Posted 2020-09-03 08:20:58

No need for Selenium; you can pull the data straight from the API. Cool site, this is the first time I've seen it.

import requests
import pandas as pd

url = 'https://www.playerprofiler.com/wp-admin/admin-ajax.php'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36'}
payload = {
    'action': 'playerprofiler_api',
    'endpoint': '/player/CM-1225'}

jsonData = requests.get(url, headers=headers, params=payload).json()

# Performance metrics for one season; 'Team' is a nested dict inside it
data = jsonData['data']['Player']['Performance Metrics']['2019']
teamData = data.pop('Team')

# One-row frame of metrics, joined on the index with a one-row frame of team fields
df = pd.DataFrame([data]).merge(pd.DataFrame([teamData]), how='left', left_index=True, right_index=True)

Output:

print(df.to_string())
  Average Defenders In The Box Average Defenders In The Box Rank Base Front Carry Rate Base Front Carry Rate Rank Base Front Yards Per Carry Base Front Yards Per Carry Rank Big Hits Big Hits Per Game Big Hits Rank Breakaway Run Rate Breakaway Run Rate Rank Breakaway Runs Breakaway Runs Per Game Breakaway Runs Per Game Rank Breakaway Runs Rank Carries Carries Inside 10 Carries Inside 10 Per Game Carries Inside 5 Carries Per Game Carries Per Game Rank Carries Rank CarriesInside 5 Per Game Catch Rate Catch Rate Rank Catchable Target Rate Catchable Target Rate Rank Catchable Targets Dominator Rating Dominator Rating Rank Drop Rate Drop Rate Rank Drops Drops Per Game Drops Per Game Rank Drops Rank Endzone Target Share Endzone Target Share Rank Endzone Targets Evaded Tackles Evaded Tackles Per Game Evaded Tackles Per Game Rank Evaded Tackles Rank Expected Fantasy Points Expected Fantasy Points Per Game Expected Fantasy Points Per Game Rank Expected Fantasy Points Rank Expected Touchdown Differential Expected Touchdowns Fantasy Points Fantasy Points Per Game Fantasy Points Per Game Differential Fantasy Points Per Game Rank Fantasy Points Per Opportunity Fantasy Points Per Opportunity Rank Fantasy Points Per Snap Fantasy Points Per Snap Rank             FullName Game Script Game Script Rank Games Games Played Rank Goal Line Carries Goal Line Carries Per Game Goal Line Carries Per Game Rank Goal Line Carries Rank Juke Rate Juke Rate Rank Light Front Carry Rate Light Front Carry Rate Rank Light Front Yards Per Carry Light Front Yards Per Carry Rank Normalized Fantasy Points Per Game Normalized Fantasy Points Per Game Rank Opportunities Opportunity Share Opportunity Share Rank Pace of Play Pace of Play Rank Pass Routes Pass Routes Per Game Pass Routes Per Game Rank Pass Snaps Production Premium Production Premium Rank QB Rating When Targeted QB Rating When Targeted Rank Receiving TDs Receiving Yards Receiving Yards Per Game Receiving Yards Per Game Rank Receiving Yards Rank 
Receptions Receptions Per Game Receptions Per Game Rank Receptions Rank Red Zone Carries Red Zone Carries Per Game Red Zone Carries Per Game Rank Red Zone Carries Rank Red Zone Catch Rate Red Zone Catch Rate Rank Red Zone Opportunities Red Zone Opportunity Share Red Zone Opportunity Share Rank Red Zone Receptions Red Zone Receptions Rank Red Zone Snap Share Red Zone Snap Share Rank Red Zone Snaps Red Zone TD Conversion Rate Red Zone TD Conversion Rate Rank Red Zone Target Share Red Zone Target Share Rank Red Zone Targets Red Zone Targets Rank Red Zone Touches Red Zone Touches Per Game Red Zone Touches Per Game Rank Red Zone Touches Rank Route Participation Route Participation Rank Run Blocking Efficiency Run Blocking Efficiency Rank Run Snaps Rush Yards Per Game Rush Yards Per Game Rank Rushing Touchdowns Rushing Touchdowns Rank Rushing Yards Rushing Yards Rank Season Shotgun Carry Rate Shotgun Carry Rate Rank Shotgun Yards Per Carry Shotgun Yards Per Carry Rank Slot Rate Slot Rate Rank Slot Snaps Slot YPR Slot YPR Rank Slot YPT Slot YPT Rank Snap Share Snap Share Rank Snap-weighted Game Script Snap-weighted Game Script Rank Snaps Stacked Front Carry Rate Stacked Front Carry Rate Rank Stacked Front Yards Per Carry Stacked Front Yards Per Carry Rank Stuffed Run Rate Stuffed Run Rate Rank Stuffed Runs Stuffed Runs Rank Target Share Target Share Rank Targets Targets Inside 10 Targets Inside 10 Per Game Targets Inside 5 Targets Inside 5 Per Game Targets Per Game Targets Per Game Rank Targets Rank Team Run Plays Team Run Plays Rank Total Touchdowns Total Touchdowns Rank Total Touches Total Yards Total Yards Per Game Total Yards Per Game Rank Total Yards Rank Touchdown Rate Touchdown Rate Rank True Yards Per Carry True Yards Per Carry Rank Under Center Carry Rate Under Center Carry Rate Rank Under Center Yards Per Carry Under Center Yards Per Carry Rank    VOS VOS Rank Weekly Volatility Weekly Volatility Rank Weighted Opportunities Weighted Opportunities Per Game 
Weighted Opportunities Per Game Rank Weighted Opportunities Rank Yards After Catch Yards After Catch Per Game Yards After Catch Per Reception Yards After Catch Per Target Yards After Catch Rank Yards Created Yards Created Per Game Yards Created Per Game Rank Yards Created Per Touch Yards Created Per Touch Rank Yards Created Rank Yards Per Carry Yards Per Carry Rank Yards Per Opportunity Yards Per Reception Yards Per Route Run Yards Per Route Run Rank Yards Per Target Yards Per Target Rank Yards Per Touch Yards Per Touch Rank               Name                                          Permalink            Team_ID
0                          6.4                               #54                 32.1%                        #41                        5.1                              #9        0               0.0           #61               4.2%                     #23             12                     0.8                           #7                  #6     287                32                          2               20             17.9                    #6           #4                      1.2      81.1%             #13                 86.7%                        #20               124            45.8%                    #2      2.1%            #35     3           0.19                 #20        #20                 4.2%                       #10               1             87                     5.4                           #8                  #3                   494.4                             30.9                                    #1                           #1                           +0.89                18.1          469.2                    29.3                                 -1.6                           #1                           1.09                                 #26                    0.47                          #10  Christian McCaffrey       -7.27              #31    16                 1                14                        0.9                              #2                     #1     21.6%            #34                  53.0%                         #15                         4.8                              #24                               27.7                                      #1           430             91.5%                     #1         2.42                #1         405                 25.3                        #1        647              +23.3                     #12                   102.4                          #12             4            1005                     62.8                            #1                   #1   
     116                 7.2                       #7              #1               56                       3.5                             #3                    #2               71.4%                      #19                     70                      95.9%                              #3                  10                       #2               86.8%                       #6            131                       20.0%                              #20                 21.9%                         #2               14                    #2               66                       4.1                             #2                    #2               64.0%                       #2                    89.9                           #5       356                86.7                       #4                 15                      #3          1387                 #3   2019              47.0%                     #19                     3.7                          #50      3.6%            #11         36      8.1            #7      5.9            #7      98.4%              #1                     -7.27                            #31  1003                    13.6%                           #40                           4.6                                #11            17.4%                   #45           50                #7        23.7%                #1     143                 6                        0.4                3                       0.2              8.9                    #1           #1           24.1                 #24               19                    #1           403        2392                149.5                        #1               #1           4.7%                 #16                  4.4                       #16                   53.3%                          #38                          5.8                                #4  +12.6       #1              10.0                    #10                  396.7                            24.8           
                        #1                          #1               979                       61.2                             8.4                          6.8                     #1           415                   25.9                         #14                    1.03                          #52                #12             4.8                  #12                   5.6                 8.7                2.48                       #9              7.0                    #9             5.9                   #6  Carolina Panthers  https://www.playerprofiler.com/depth-charts/ca...  carolina-panthers
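The `Team` handling in the code above amounts to lifting one nested dict into the top level of the record before building a single row. A minimal pure-Python sketch of that step follows; `lift_nested` is a hypothetical helper, and the sample record is hand-written from a few fields visible in the output (the real payload has many more fields, and its value types may differ). Unlike `merge`, which suffixes clashing column names with `_x` and `_y`, this sketch prefixes a clashing nested key with the parent key.

```python
def lift_nested(record, key):
    """Return a flat copy of record with the dict stored under key merged in.

    The input record is left unchanged.
    """
    flat = {k: v for k, v in record.items() if k != key}
    nested = record.get(key) or {}
    for k, v in nested.items():
        # On a name clash keep the top-level value and prefix the nested one
        target = k if k not in flat else f"{key}_{k}"
        flat[target] = v
    return flat


# Hand-written sample for illustration only
sample = {
    "Carries": 287,
    "Season": 2019,
    "Team": {"Name": "Carolina Panthers", "Team_ID": "carolina-panthers"},
}

row = lift_nested(sample, "Team")
print(sorted(row))  # ['Carries', 'Name', 'Season', 'Team_ID']
```

A list of such flat rows can then be passed to `pd.DataFrame(rows)` directly, without a merge.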

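Edit

To get other players, change the player ID in the payload. You can get the player IDs from another API, then filter them however you like and iterate over the result, or manually pick out the one you want: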

import requests
import pandas as pd

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36'}

# Get the list of players (and their IDs)
url = 'https://www.playerprofiler.com/api/v1/players'
jsonData = requests.get(url, headers=headers).json()
players = jsonData['data']['Players']

# The per-player endpoint and the season are the same for every request
url = 'https://www.playerprofiler.com/wp-admin/admin-ajax.php'
season = '2019'

rows = []
for player in players:
    print('Acquired: %s' % player['Full Name'])
    payload = {
        'action': 'playerprofiler_api',
        'endpoint': '/player/%s' % player['Player_ID']}

    jsonData = requests.get(url, headers=headers, params=payload).json()

    # Skip players that have no performance metrics for the chosen season
    metrics = jsonData['data']['Player'].get('Performance Metrics', {})
    if season not in metrics:
        continue

    # One row per player: season metrics plus the fields from the player list
    row = {}
    for each in [metrics[season], player]:
        row.update(each)
    rows.append(row)

df = pd.DataFrame(rows)

# Keep running backs only
rb_df = df[df['Position'] == 'RB']
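The loop above requests every player and filters by position only at the end. If the records returned by the player-list API already carry a `Position` field (an assumption; the code above does not show which source that column comes from), the list could be filtered first so that fewer requests are made. A minimal sketch with hypothetical helper names and a hand-written player list, in which only `CM-1225` is taken from the answer and the other IDs are made up:

```python
def select_players(players, position):
    """Keep only the player records whose 'Position' equals position."""
    return [p for p in players if p.get("Position") == position]


def build_payload(player_id):
    """Build the query parameters used for one per-player request."""
    return {
        "action": "playerprofiler_api",
        "endpoint": "/player/%s" % player_id,
    }


# Hand-written sample; only CM-1225 appears in the answer, the rest are made up
sample_players = [
    {"Player_ID": "CM-1225", "Full Name": "Christian McCaffrey", "Position": "RB"},
    {"Player_ID": "XX-0001", "Full Name": "Example Quarterback", "Position": "QB"},
    {"Player_ID": "XX-0002", "Full Name": "Example Runner", "Position": "RB"},
]

running_backs = select_players(sample_players, "RB")
payloads = [build_payload(p["Player_ID"]) for p in running_backs]
print(payloads[0])
```

Each payload would then be passed as `params=` to `requests.get`, exactly as in the loop above.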
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/63715284
