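I am trying to scrape the Google knowledge panel to retrieve the names of drugs if they do not appear in the Google search. For example, if I look up "Buscopan" in Google, the page that appears looks like this:

Now, I am trying to extract the term shown in the knowledge panel with the code below, but I am not able to retrieve it from the HTML, even though I can see it when I inspect the element. To be precise, the code I am running, together with the error message, is as follows: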
import requests
from bs4 import BeautifulSoup
# URL
url = "https://www.google.com/search?client=safari&rls=en&q=" + "buscopan" + "&ie=UTF-8&oe=UTF-8"
# Sending HTTP request
req = requests.get(url)
# Pulling HTTP data from internet
sor = BeautifulSoup(req.text, "html.parser")
temp = sor.find("h2", class_="qrShPb kno-ecr-pt PZPZlf mfMhoc hNKfZe").text
print(temp)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-39-ef5599a1a1fc> in <module>
13 # Finding temperature in Celsius
14 #temp = sor.find("h2", class_='qrShPb').text
---> 15 temp = sor.find("h2", class_= "qrShPb kno-ecr-pt PZPZlf mfMhoc hNKfZe").text
16
17
AttributeError: 'NoneType' object has no attribute 'text'

I don't know what I am doing wrong. I think the HTML I need to look at is the following:
<h2 class="qrShPb kno-ecr-pt PZPZlf mfMhoc hNKfZe" data-local-attribute="d3bn" data-attrid="title" data-ved="2ahUKEwjujfLcgO7rAhWKjosKHSiBAFEQ3B0oATASegQIEBAL"></h2>
Of course, the rest of the HTML is in the picture above, but if you need a larger version, please don't hesitate to ask!
Any suggestions?
Thanks,
Federico
Posted on 2020-09-16 16:27:30
To get the correct results page from Google, specify the User-Agent header. For example:
import requests
from bs4 import BeautifulSoup
params = {
'q': 'buscopan', # <-- change to your keyword
'hl': 'it' # <-- change to `en` for english results
}
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}
url = 'https://www.google.com/search'
soup = BeautifulSoup(requests.get(url, params=params, headers=headers).content, 'html.parser')
print(soup.select_one('h2[data-attrid="title"]').text)

Prints:
Scopolamina-N-butilbromuro

Posted on 2021-04-09 07:54:12
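As an alternative to Andrej Kesely's solution, you can use the third-party Google Knowledge Graph API from SerpApi. It is a paid API with a free plan. Check out the playground to test it.
Code to integrate, and a full example in the online IDE: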
from serpapi import GoogleSearch
import os
params = {
"q": "Buscopan",
"google_domain": "google.com",
"hl": "en",
"api_key": os.getenv("API_KEY"),
}
search = GoogleSearch(params)
results = search.get_dict()
title = results['knowledge_graph']['title']
print(title)

Output:
Butylscopolamine

Part of the JSON knowledge graph output:
"knowledge_graph": {
"title": "Butylscopolamine",
"type": "Medication",
"description": "Hyoscine butylbromide, also known as scopolamine butylbromide and sold under the brandname Buscopan among others, is an anticholinergic medication used to treat crampy abdominal pain, esophageal spasms, renal colic, and bladder spasms. It is also used to improve respiratory secretions at the end of life.",
"source": {
"name": "Wikipedia",
"link": "https://en.wikipedia.org/wiki/Hyoscine_butylbromide"
},
"formula": "C₂₁H₃₀BrNO₄",
"molar_mass": "440.371 g/mol",
"chem_spider_id": "16736107",
"trade_name": "Buscopan, others",
"pub_chem_cid": "6852391",
"ch_ebi_id": "32123",
"people_also_search_for": "Scopolamine, Metamizole, MORE"
}

Disclaimer: I work for SerpApi.
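If a query has no knowledge panel, the lookup results['knowledge_graph']['title'] above would presumably fail with a KeyError, much like the NoneType error in the question. Below is a minimal sketch of a defensive lookup. It assumes the response is a plain dict shaped like the JSON excerpt above and that the "knowledge_graph" key may be missing; knowledge_title is a hypothetical helper name, not part of the SerpApi client.

```python
from typing import Optional


def knowledge_title(results: dict) -> Optional[str]:
    """Return the knowledge-graph title, or None if the panel is missing."""
    # Assumption: `results` is shaped like the JSON excerpt above, and the
    # "knowledge_graph" key may be absent (or null) when no panel is shown.
    panel = results.get("knowledge_graph") or {}
    return panel.get("title")


# Sample data copied from the JSON excerpt above (no network call is made).
sample = {"knowledge_graph": {"title": "Butylscopolamine", "type": "Medication"}}
print(knowledge_title(sample))  # Butylscopolamine
print(knowledge_title({}))      # None
```

The same idea applies to the BeautifulSoup version: select_one returns None when nothing matches, so checking the result before reading .text avoids the AttributeError.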
https://stackoverflow.com/questions/63924038