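I came here earlier today with a question about this project, and it was answered so quickly that I am back again. The code below scrapes the website given below, pulls the data, and adds a column recording which instance of the table (which position filter) is being scraped. The next battle I am facing: I would like to load all of the Game (recency) instances into big_df as well, with a column recording which game recency is currently being scraped. If anyone could help me with this last piece of the puzzle, I would greatly appreciate it.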
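To make the target shape concrete, here is a minimal browser-free sketch of the goal described above: one frame per (recency, position) pair, each tagged with both labels, all stacked into a single frame. The helper names `fake_table` and `collect_tables`, the label lists, and the numbers are hypothetical stand-ins, not part of the original code or the site.

```python
import pandas as pd

# Hypothetical filter labels; the real ones come from the page's dropdown and pills.
RECENCIES = ["Season", "Last 7"]
POSITIONS = ["ALL", "PG", "C"]


def fake_table(position: str, recency: str) -> pd.DataFrame:
    """Stand-in for one scraped table (made-up numbers, two made-up teams)."""
    return pd.DataFrame({"Team": ["AAA", "BBB"], "PTS": [1.0, 2.0]})


def collect_tables(recencies, positions, fetch) -> pd.DataFrame:
    """Tag each table with the filters that produced it, then stack them."""
    frames = []
    for recency in recencies:
        for position in positions:
            df = fetch(position, recency).copy()
            df["Category"] = position
            df["Recency"] = recency
            frames.append(df)
    # Concatenate once at the end instead of growing a frame inside the loop.
    return pd.concat(frames, axis=0, ignore_index=True)


big_df = collect_tables(RECENCIES, POSITIONS, fake_table)
print(big_df)
```

Collecting the frames in a list and calling `pd.concat` once is a design choice of this sketch; concatenating inside the loop, as the code in this post does, yields the same rows.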
https://www.fantasypros.com/daily-fantasy/nba/fanduel-defense-vs-position.php
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time as t
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
big_df = pd.DataFrame()
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")
webdriver_service = Service(r'chromedriver\chromedriver') ## path to where you saved chromedriver binary
driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)
wait = WebDriverWait(driver, 20)
url = "https://www.fantasypros.com/daily-fantasy/nba/fanduel-defense-vs-position.php"
driver.get(url)
t.sleep(60)  # time is imported as t, so a bare sleep() would raise NameError
tables_list = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//ul[@class="pills pos-filter pull-left"]/li')))
for x in tables_list:
    x.click()
    print('selected', x.text)
    t.sleep(2)
    table = wait.until(EC.element_to_be_clickable((By.XPATH, '//table[@id="data-table"]')))
    df = pd.read_html(table.get_attribute('outerHTML'))[0]
    df['Category'] = x.text.strip()
    big_df = pd.concat([big_df, df], axis=0, ignore_index=True)
    print('done, moving to next table')
print(big_df)
big_df.to_csv('fanduel.csv')

Posted on 2022-10-12 19:11:34
This is how you can achieve your end goal:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time as t
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
big_df = pd.DataFrame()
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument('disable-notifications')
chrome_options.add_argument("window-size=1280,720")
webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
driver = webdriver.Chrome(service=webdriver_service, options=chrome_options)
wait = WebDriverWait(driver, 20)
url = "https://www.fantasypros.com/daily-fantasy/nba/fanduel-defense-vs-position.php"
driver.get(url)
select_recency_options = [x.text for x in wait.until(EC.presence_of_all_elements_located((By.XPATH, '//select[@class="game-change"]/option')))]
for option in select_recency_options:
    select_recency = Select(WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//select[@class="game-change"]'))))
    select_recency.select_by_visible_text(option)
    print('selected', option)
    t.sleep(2)
    tables_list = wait.until(EC.presence_of_all_elements_located((By.XPATH, '//ul[@class="pills pos-filter pull-left"]/li')))
    for x in tables_list:
        x.click()
        print('selected', x.text)
        t.sleep(2)
        table = wait.until(EC.element_to_be_clickable((By.XPATH, '//table[@id="data-table"]')))
        df = pd.read_html(table.get_attribute('outerHTML'))[0]
        df['Category'] = x.text.strip()
        df['Recency'] = option
        big_df = pd.concat([big_df, df], axis=0, ignore_index=True)
        print('done, moving to next table')
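display(big_df)  # display() is available in Jupyter/IPython; use print(big_df) in a plain script
big_df.to_csv('fanduel.csv')

The result is a (bigger) dataframe: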
                          Team    PTS    REB   AST   3PM   STL   BLK    TO  FD PTS  Category  Recency
0           HOUHouston Rockets  23.54   9.10  5.10  2.54  1.88  1.15  2.65   48.55       ALL   Season
1     OKCOklahoma City Thunder  22.22   9.61  5.19  2.70  1.67  1.18  2.52   47.57       ALL   Season
2    PORPortland Trail Blazers  22.96   8.92  5.31  2.74  1.63  0.99  2.65   46.84       ALL   Season
3          SACSacramento Kings  23.00   9.10  5.03  2.58  1.61  0.95  2.50   46.65       ALL   Season
4             ORLOrlando Magic  22.35   9.39  4.94  2.62  1.57  1.04  2.50   46.36       ALL   Season
...                        ...    ...    ...   ...   ...   ...   ...   ...     ...       ...      ...
715         TORToronto Raptors  23.33  13.97  2.77  0.57  0.84  1.88  3.38   49.03         C  Last 30
716         NYKNew York Knicks  19.78  15.40  2.94  0.53  0.90  1.92  2.17   48.96         C  Last 30
717           BKNBrooklyn Nets  19.69  13.60  3.16  0.86  1.10  2.25  2.06   48.74         C  Last 30
718          BOSBoston Celtics  17.79  11.95  3.75  0.41  1.66  1.80  2.54   45.60         C  Last 30
719              MIAMiami Heat  17.41  14.19  2.16  0.50  1.01  1.52  1.75   43.52         C  Last 30
720 rows × 11 columns

https://stackoverflow.com/questions/74046780
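In the output above, the Team column fuses the team abbreviation with the full name (for example `HOUHouston Rockets`). If separate columns are wanted, one possible post-processing step is sketched below. It rests on an assumption that is not confirmed by the site: the abbreviation is 2 to 4 capital letters and the full name starts with a capital followed by a lowercase letter. The column names `Abbr` and `Name` are hypothetical.

```python
import pandas as pd

# Sample values copied from the Team column in the output above.
df = pd.DataFrame(
    {"Team": ["HOUHouston Rockets", "OKCOklahoma City Thunder", "NYKNew York Knicks"]}
)

# Assumption: 2-4 capitals form the abbreviation; the name begins with
# a capital letter followed by a lowercase letter. The regex engine backtracks
# so that the abbreviation gives up its last capital to the name when needed.
pattern = r"^(?P<Abbr>[A-Z]{2,4})(?P<Name>[A-Z][a-z].*)$"

# str.extract returns one column per named group; join aligns on the index.
df = df.join(df["Team"].str.extract(pattern))
print(df[["Abbr", "Name"]])
```

Rows that do not match the pattern come back as NaN in both new columns, which makes any team that breaks the assumption easy to spot.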