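I am trying to get the review count from the HTML below, but I have not been able to. I have tried locating it by class name, CSS selector, and so on, but the element is never found. Any help would be appreciated; the HTML is below. I also have multiple elements like this and need to loop over them to get each review count. How can I do that?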
<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews?utm_campaign=srp_ratings&utm_medium=desktop&utm_source=naukri" target="_blank" title="Powered by Ambition Box">(2148 Reviews)</a>
<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/dxc-technology-reviews?utm_campaign=srp_ratings&utm_medium=desktop&utm_source=naukri" target="_blank" title="Powered by Ambition Box">(3919 Reviews)</a>
Posted on 2021-06-25 01:22:24
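You can get the element's text with the following command:
# find_element_by_xpath was removed in Selenium 4.3; find_element(By.XPATH, ...) works in Selenium 3 and 4
from selenium.webdriver.common.by import By
all_text = driver.find_element(By.XPATH, "//a[contains(@href,'https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews')]").text
Now you can extract the number of reviews with the following code:
# In Python 3, filter() returns an iterator, so join the digits before calling int()
reviews = int(''.join(filter(str.isdigit, all_text)))
Or use this: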
import re
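# findall returns a list of digit strings, not an int
reviews = re.findall(r'\d+', all_text)
Before accessing the element, don't forget to add a wait/delay to make sure it has fully loaded.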
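For the "multiple elements" part of the question, the same idea can be applied in a loop. The following is only a minimal sketch under some assumptions: `parse_review_count` and `collect_review_counts` are hypothetical helper names, the selector `a.reviewsCount` assumes every review link carries the `reviewsCount` class shown in the question, and the comma handling assumes larger counts might be rendered like `3,919` (the question only shows plain digits). The literal `"css selector"` is the string value of Selenium's `By.CSS_SELECTOR`, used here so the snippet needs no Selenium import.

```python
import re


def parse_review_count(text):
    """Return the first integer in text such as "(2148 Reviews)", or None if there is none."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    # Assumption: thousands separators, if any, are commas.
    return int(match.group().replace(",", ""))


def collect_review_counts(driver):
    """Map each review link's href to its parsed review count."""
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    links = driver.find_elements("css selector", "a.reviewsCount")
    return {link.get_attribute("href"): parse_review_count(link.text) for link in links}
```

With a live `driver` that has already loaded the page, `collect_review_counts(driver)` should give one entry per matching link. Keying by `href` is just one choice; a plain list of counts would work equally well if the company URL is not needed.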
Posted on 2021-06-25 01:24:02
You can combine BeautifulSoup with Selenium:
from bs4 import BeautifulSoup
data = """<a class="reviewsCount ml-5 fleft blue-text " href="https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews?utm_campaign=srp_ratings&utm_medium=desktop&utm_source=naukri" target="_blank" title="Powered by Ambition Box">(2148 Reviews)</a>"""
soup = BeautifulSoup(data, 'html.parser')
finds = soup.find('a', {'class': 'reviewsCount'})
print(finds.text)
Posted on 2021-06-25 01:39:45
Try using a CSS selector:
a[href*='https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews'][class^='reviewsCount']
Code:
wait = WebDriverWait(driver, 10)
total_review_count = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a[href*='https://www.ambitionbox.com/reviews/larsen-and-toubro-infotech-reviews'][class^='reviewsCount']")))
print(total_review_count.text)
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
https://stackoverflow.com/questions/68120104