我能够获得具有典型数据的元素:
<input id="mbb-offeringID-1" type="checkbox" name="offeringID.1" value="cLtX5MPovjIvCLvcPPMSjATUpzLHNW%2Bk9pGa3%2BTkldS92roGTDDM%2B8BvfXGJP5GWE3DNQLNkJUny6cgknOlP%2F3xEODHJiPzzb3Io7oXbgDNP9cSrVkoaA5JVvTc1IPb%2BEcbB%2BeqhE7BAczO81wFLlD3SbtD55y7hOIa5DYQLzkaI9FHJTuyAphRUriSbCRuS">使用
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "lxml")
title = soup.find('input', {'id': 'mbb-offeringID-1'}).get('value')
print(title) 如何使用隐藏输入检索值,例如:
<input type="hidden" name="offeringID.1" value="9Lt1oDtQ%2BIAdndBuUQBzl%2FXSUE8quGoqB41HEfz9IncLO4u3HybZ3EWtylW8vTJ1v3KZOS%2FPQRFGN6L0a0pjYFd8KcQ%2Bok3AsTNXxrQUaar1gXa7EHhACX2c%2Bh72E3izLUOwM4q6Wxw%3D">这是完整的html
<div class="a-fixed-right-grid-col aod-atc-column a-col-right" style="width:150px;margin-right:-150px;float:left;">
<form method="post" action="/gp/add-to-cart/html/ref=aod_dpdsk_new_0" class="aod-atc-form-header-desktop a-spacing-none">
<input type="hidden" name="session-id" value="146-6039598-0678601">
<input type="hidden" name="offeringID.1" value="9Lt1oDtQ%2BIAdndBuUQBzl%2FXSUE8quGoqB41HEfz9IncLO4u3HybZ3EWtylW8vTJ1v3KZOS%2FPQRFGN6L0a0pjYFd8KcQ%2Bok3AsTNXxrQUaar1gXa7EHhACX2c%2Bh72E3izLUOwM4q6Wxw%3D">发布于 2021-05-05 16:25:58
由于您还希望将"name = offeringID.1"作为关键字参数传递给find_all(),所以我将其标记为重复。我会在这里发布一个解决方案。您可以添加attrs=参数:
for tag in soup.find_all("input", type="hidden", attrs={"name": "offeringID.1"}):
print(tag["value"])编辑:数据是通过Ajax从外部加载的,您得到的“值”如下:
import requests
from bs4 import BeautifulSoup
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36",
"referer": "https://www.amazon.ca/gp/aod/ajax/ref=auto_load_aod?asin=B07RF237B1&pc=dp",
}
params = (
("asin", "B07RF237B1"),
("pc", "dp"),
)
response = requests.get(
"https://www.amazon.ca/gp/aod/ajax/ref=auto_load_aod",
headers=headers,
params=params,
)
soup = BeautifulSoup(response.content, "html.parser")
print(soup.find("input", type="hidden", attrs={"name": "offeringID.1"})["value"])发布于 2021-05-05 16:01:44
你可以做到的
from bs4 import BeautifulSoup
page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "lxml")
hidden_tags = soup.find_all("input", type="hidden")
for tag in hidden_tags:
print(tag )https://stackoverflow.com/questions/67404699
复制相似问题