I want to ignore one class when using find_all. I followed this solution: Select all divs except ones with certain classes in BeautifulSoup
My divs are a little different, and I want to ignore description-0.
<div class="abc">...</div>
<div class="parent">
<div class="description-0"></div>
<div class="description-1"></div>
<div class="description-2"></div>
</div>
<div class="xyz">...</div>
Here is my code:
classToIgnore = ["description-0"]
all = soup.find_all('div', class_=lambda x: x not in classToIgnore)
It reads every div on the page, not just the "description-n" ones. How can I fix this?
Posted on 2022-01-04 09:58:01
One option is to use a regex, for example:
import re
from bs4 import BeautifulSoup
sample_html = """<div class="abc">...</div>
<div class="description-0"></div>
<div class="description-1"></div>
<div class="description-2"></div>
<div class="xyz">...</div>"""
classes_regex = (
    BeautifulSoup(sample_html, "lxml")
    .find_all("div", {"class": re.compile(r"description-[1-9]")})
)
print(classes_regex)
Output:
[<div class="description-1"></div>, <div class="description-2"></div>]
https://stackoverflow.com/questions/70576806
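As an alternative to the regex, the lambda from the question can probably be repaired directly. The likely cause of the problem is that `x not in classToIgnore` is true for every class other than description-0, and also for `None`, which is what Beautiful Soup passes for a div with no class at all. The sketch below adds a positive check for the `description-` prefix. The names `wanted_class` and `classes_to_ignore` are made up for illustration, `html.parser` is used only to avoid the lxml dependency, and it assumes each div carries a single class, as in the sample. A CSS selector version is included for comparison.

```python
from bs4 import BeautifulSoup

sample_html = """<div class="abc">...</div>
<div class="parent">
<div class="description-0"></div>
<div class="description-1"></div>
<div class="description-2"></div>
</div>
<div class="xyz">...</div>"""

# Hypothetical name: the classes that should be skipped.
classes_to_ignore = {"description-0"}


def wanted_class(css_class):
    # Called with each class value of a tag, or with None when the tag
    # has no class attribute, so None must be rejected explicitly.
    return (
        css_class is not None
        and css_class.startswith("description-")
        and css_class not in classes_to_ignore
    )


soup = BeautifulSoup(sample_html, "html.parser")

# Option 1: a filter function passed to class_.
by_function = soup.find_all("div", class_=wanted_class)

# Option 2: a CSS selector that keeps divs whose class attribute starts
# with "description-" and drops the ones with class description-0.
by_selector = soup.select('div[class^="description-"]:not(.description-0)')

print(by_function)
print(by_selector)
```

Both calls should return only the description-1 and description-2 divs for this sample. If the real page has divs with several classes, the prefix check would need adjusting, since it looks at one class value at a time.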