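I'm fairly new to Python, and my question is:
I want to specify a website and have a Python module (e.g. BeautifulSoup) scrape the page title, then print "Bingo" if the title matches any word in a list of words, and otherwise print "nothing here".
Here is my code. Any suggestions or ideas on how to make this work?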
import urllib.request
from bs4 import BeautifulSoup
Match = ("Whois", "domain", "IP", "search")
soup = BeautifulSoup(urllib.request.Request("https://whois.domaintools.com/"))
if (soup.title.string in Match):
    print ("Bingo")
else:
    print ("Nothing here!")

Answered 2017-05-26 01:02:17
Use the requests module:
import requests
from bs4 import BeautifulSoup
r = requests.get('https://whois.domaintools.com/')
soup = BeautifulSoup(r.text, 'html.parser')
print(r.text)
This prints the following message:
Please contact memberservices@domaintools.com and reference error #4311
I suspect this is because they block scrapers. Indeed, once we specify a browser-like User-Agent, the page loads correctly. So the fixed version becomes:
import requests
from bs4 import BeautifulSoup
Match = ("Whois", "domain", "IP", "search")
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
r = requests.get('https://whois.domaintools.com/', headers=headers)
soup = BeautifulSoup(r.text, 'html.parser')
for m in Match:
    if m in soup.title.string:
        print('Bingo!')
        break  # Exit checking loop

https://stackoverflow.com/questions/44191561
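
The check in the question, `soup.title.string in Match`, asks whether the whole title is equal to one of the tuple's elements, so it only succeeds when the title is exactly "Whois", "domain", "IP" or "search". The loop above does a substring test per word instead. Below is a minimal sketch of that same idea as a helper that also prints the "Nothing here!" branch and handles a missing title. The function name `title_matches`, the case-insensitive comparison, and the sample title string are assumptions made for illustration; they are not part of the original answer.

```python
def title_matches(title, words):
    """Return True if any of `words` occurs in `title`, ignoring case.

    `title` may be None (e.g. when the page has no <title> tag),
    in which case nothing can match.
    """
    if not title:
        return False
    lowered = title.lower()
    # any() stops at the first hit, like the `break` in the loop version
    return any(word.lower() in lowered for word in words)


Match = ("Whois", "domain", "IP", "search")

# Hypothetical title used only for this example; in the real script it
# would come from the parsed page.
sample_title = "Whois Lookup and Domain Search"

if title_matches(sample_title, Match):
    print("Bingo")
else:
    print("Nothing here!")
```

With the scraping code above, the call would look like `title_matches(soup.title.string if soup.title else None, Match)`. Note that this is a plain substring test, so a short word such as "IP" also matches inside longer words (for example "description"); a word-boundary regular expression would be one way to tighten that if it matters.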