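I'm trying to scrape a table from this page, https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm, and from other similar pages.
The table in question has a dynamic id of the form table-XXXX, where XXXX is a different number each time the page loads.
The table has the following attributes: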
class="tablesaw tablesaw-stack table-bordered table-centered rates-availability-table"
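data-tablesaw-mode="stack"
I've tried variations of the following to locate the table (after consulting the post "How to find element by part of its id name in selenium with python"), but nothing seems to work.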
find_elements_by_css_selector("[id*='tab']")
find_elements_by_css_selector("[class*='tablesaw']")
find_elements_by_css_selector("[data-tablesaw-mode*='stack']")
Posted on 2020-06-28 15:06:28
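The table WebElement is an Ajax element, so to print its contents you have to induce WebDriverWait for visibility_of_element_located(). You can use either of the following locator strategies:
CSS_SELECTOR:
driver.get('https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table.tablesaw.tablesaw-stack.table-bordered.table-centered.rates-availability-table"))).text)
XPATH:
driver.get('https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm')
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//table[@class='tablesaw tablesaw-stack table-bordered table-centered rates-availability-table']"))).text)
Both snippets assume an already initialized driver and the following imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Posted on 2020-06-28 14:40:13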
The data is loaded dynamically via JavaScript, but you can load the table through their API.
For example:
import requests
from bs4 import BeautifulSoup
url = 'https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm'
rates_url = 'https://www.holidayfrancedirect.co.uk/api/property-rates/{property_id}/2020'
property_id = url.split('/')[-2]
data = requests.get(rates_url.format(property_id=property_id)).json()
soup = BeautifulSoup(data['ratesHtml'], 'html.parser')
# print table to screen:
for tr in soup.select('tr'):
    tds = [td.get_text(strip=True) for td in tr.select('td, th')]
    print(('{:<15}'*7).format(*tds))
Prints:
Start Date     End Date       3 Nights       4 Nights       5 Nights       6 Nights       7 Nights
28 Mar 2020    1 May 2020     £225           £300           £350           £410           £470
2 May 2020     26 Jun 2020    £250           £330           £400           £460           £530
27 Jun 2020    3 Jul 2020     -              -              -              -              £675
4 Jul 2020     10 Jul 2020    -              -              -              -              £920
11 Jul 2020    14 Aug 2020    -              -              -              -              £985
15 Aug 2020    21 Aug 2020    -              -              -              -              £920
22 Aug 2020    28 Aug 2020    -              -              -              -              £675
29 Aug 2020    31 Oct 2020    -              -              -              -              £470
https://stackoverflow.com/questions/62622511
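The rates table printed by the API-based answer consists of plain strings, which is fine for display but awkward for further processing. As a minimal sketch, the rows could be turned into records with numeric prices and written out as CSV using only the standard library. The helper names `parse_price`, `rows_to_records` and `records_to_csv` are hypothetical, and the sketch assumes that the first two columns are dates, that prices look like `£470`, and that `-` means the stay length is not offered.

```python
import csv
import io


def parse_price(cell):
    """Convert a cell such as '£470' to an int; '-' or an empty cell becomes None."""
    cell = cell.strip()
    if cell in ('', '-'):
        return None
    return int(cell.lstrip('£').replace(',', ''))


def rows_to_records(rows):
    """Turn [header, row, row, ...] into a list of dicts keyed by the header cells."""
    header, *body = rows
    records = []
    for row in body:
        record = dict(zip(header, row))
        # Assumption: the first two columns are dates, the rest are prices.
        for key in header[2:]:
            record[key] = parse_price(record[key])
        records.append(record)
    return records


def records_to_csv(records):
    """Serialize the records to CSV text; None is written as an empty field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Header and two data rows copied from the printed table.
rows = [
    ['Start Date', 'End Date', '3 Nights', '4 Nights', '5 Nights', '6 Nights', '7 Nights'],
    ['28 Mar 2020', '1 May 2020', '£225', '£300', '£350', '£410', '£470'],
    ['27 Jun 2020', '3 Jul 2020', '-', '-', '-', '-', '£675'],
]
records = rows_to_records(rows)
print(records_to_csv(records))
```

In the scraping loop, each `tds` list could be appended to `rows` instead of being printed, so the same helpers would apply to the live data.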
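Since the question also mentions other similar pages, the step that derives the property id from the listing URL could be wrapped in a small helper. The sketch below is only an illustration: the function names are hypothetical, and treating the year as a parameter is an assumption, because the thread only shows the endpoint being called with 2020.

```python
from urllib.parse import urlsplit

# Endpoint pattern taken from the API-based answer; the {year} placeholder is an
# assumption, since only 2020 appears in the thread.
RATES_URL = 'https://www.holidayfrancedirect.co.uk/api/property-rates/{property_id}/{year}'


def property_id_from_url(listing_url):
    """Return the path segment just before the file name, e.g. 'RG007075'."""
    parts = [part for part in urlsplit(listing_url).path.split('/') if part]
    if len(parts) < 2:
        raise ValueError('unexpected listing URL: ' + listing_url)
    return parts[-2]


def build_rates_url(listing_url, year):
    """Build the rates API URL for a listing page and a given year."""
    return RATES_URL.format(property_id=property_id_from_url(listing_url), year=year)


listing = 'https://www.holidayfrancedirect.co.uk/holiday-rentals/RG007075/index.htm'
print(build_rates_url(listing, 2020))
```

Unlike a bare `url.split('/')[-2]`, parsing the path first ignores any query string or fragment, and the length check fails loudly on a URL that does not have the expected shape.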