I'm trying to retrieve the sale price of a house from Redfin.
Here is part of the HTML:
<div class="timeline"><div class="property-history-content-container"><div class="timeline-content"><h4 class="section-header col-12">Today</h4><div class="sold-row row PropertyHistoryEventRow" id="propertyHistory-0"><div class="col-4"><p>Oct 15, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Sold (MLS) (Closed)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">$302,000<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class=" row PropertyHistoryEventRow" id="propertyHistory-1"><div class="col-4"><p>Sep 16, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Contingent (Active Under Contract)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class=" row PropertyHistoryEventRow" id="propertyHistory-2"><div class="col-4"><p>Sep 8, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Pending (Pending - Taking Backups)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class=" row PropertyHistoryEventRow" id="propertyHistory-3"><div class="col-4"><p>Sep 5, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Listed (Active)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">$294,900<span class="number empty"> </span></div><p class="subtext">Price</p></div></div></div></div></div>
Here is part of the code:
url = 'https://www.redfin.com/TX/Cedar-Park/615-Fence-Post-Pass-78613/home/32939011'
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'html.parser')
soup.find_all(class_ = 'timeline-content')
However, the code does not return the "Sold" event, only the three events that preceded the sale.
Here is the result of soup.find_all(class_ = 'timeline-content'):
[<div class="timeline-content"><h4 class="section-header col-12">Today</h4><div class="row PropertyHistoryEventRow" id="propertyHistory-0"><div class="col-4"><p>Sep 16, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Contingent (Active Under Contract)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class="row PropertyHistoryEventRow" id="propertyHistory-1"><div class="col-4"><p>Sep 8, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Pending (Pending - Taking Backups)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class="row PropertyHistoryEventRow" id="propertyHistory-2"><div class="col-4"><p>Sep 5, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Listed (Active)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number empty">**<span class="number empty"> </span></div><p class="subtext">Price</p></div></div></div>]
Posted on 2020-12-29 08:55:39
Specify the tag you want to find as well; you can also use soup.find('div', class_='timeline-content').
An example based on the HTML you provided:
from bs4 import BeautifulSoup
html = """
<div class="timeline"><div class="property-history-content-container"><div class="timeline-content"><h4 class="section-header col-12">Today</h4><div class="sold-row row PropertyHistoryEventRow" id="propertyHistory-0"><div class="col-4"><p>Oct 15, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Sold (MLS) (Closed)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">$302,000<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class=" row PropertyHistoryEventRow" id="propertyHistory-1"><div class="col-4"><p>Sep 16, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Contingent (Active Under Contract)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class=" row PropertyHistoryEventRow" id="propertyHistory-2"><div class="col-4"><p>Sep 8, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Pending (Pending - Taking Backups)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div></div><div class=" row PropertyHistoryEventRow" id="propertyHistory-3"><div class="col-4"><p>Sep 5, 2020</p><p class="subtext">Date</p></div><div class="description-col col-4"><div>Listed (Active)</div><div></div><p class="subtext">ACTRIS #5085856</p></div><div class="col-4"><div class="price-col number">$294,900<span class="number empty"> </span></div><p class="subtext">Price</p></div></div></div></div></div>
"""
soup = BeautifulSoup(html, 'html.parser')
soup.find('div', class_='timeline-content').find('div', class_='price-col number').text
Output:
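$302,000
Note that you must be logged in to get the HTML you posted; otherwise the page will not contain all the information you want to scrape, such as the price.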
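If you need every history event rather than just the first price, something like the following might work. It is only a sketch built on a trimmed copy of the markup from the question: the helper names `parse_price` and `history_events` are made up for illustration, and the selectors assume Redfin keeps the class names shown above (`PropertyHistoryEventRow`, `sold-row`, `description-col`, `price-col`), which may change at any time.

```python
import re
from bs4 import BeautifulSoup

# Trimmed copy of the property-history markup from the question (two rows only).
SAMPLE_HTML = """
<div class="timeline-content">
  <div class="sold-row row PropertyHistoryEventRow" id="propertyHistory-0">
    <div class="col-4"><p>Oct 15, 2020</p><p class="subtext">Date</p></div>
    <div class="description-col col-4"><div>Sold (MLS) (Closed)</div><div></div><p class="subtext">ACTRIS #5085856</p></div>
    <div class="col-4"><div class="price-col number">$302,000<span class="number empty"> </span></div><p class="subtext">Price</p></div>
  </div>
  <div class=" row PropertyHistoryEventRow" id="propertyHistory-1">
    <div class="col-4"><p>Sep 16, 2020</p><p class="subtext">Date</p></div>
    <div class="description-col col-4"><div>Contingent (Active Under Contract)</div><div></div><p class="subtext">ACTRIS #5085856</p></div>
    <div class="col-4"><div class="price-col number">—<span class="number empty"> </span></div><p class="subtext">Price</p></div>
  </div>
</div>
"""


def parse_price(text):
    """Hypothetical helper: '$302,000' becomes 302000; text without digits becomes None."""
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None


def history_events(html):
    """Hypothetical helper: return one dict per PropertyHistoryEventRow in the markup."""
    soup = BeautifulSoup(html, "html.parser")
    events = []
    for row in soup.select("div.PropertyHistoryEventRow"):
        # The first <p> directly inside a col-4 div holds the date.
        date = row.select_one("div.col-4 > p").get_text(strip=True)
        # The first <div> directly inside description-col holds the event name.
        description = row.select_one("div.description-col > div").get_text(strip=True)
        # Matching on the single class price-col also covers "price-col number empty".
        price_text = row.select_one("div.price-col").get_text(strip=True)
        events.append({
            "date": date,
            "description": description,
            "price": parse_price(price_text),
            # In the question's markup, only the sold event carries the sold-row class.
            "sold": "sold-row" in row.get("class", []),
        })
    return events


events = history_events(SAMPLE_HTML)
sold_events = [event for event in events if event["sold"]]
for event in sold_events:
    print(event["date"], event["price"])
```

Selecting with CSS selectors on a single class avoids depending on the exact `class` attribute string, which is what `class_='price-col number'` matches. If the page was fetched without being logged in, `sold_events` would presumably just come back empty, since the sold row is missing from that HTML.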
https://stackoverflow.com/questions/65441095