I am trying to fetch the contents of App Store > Business:
import requests
from lxml import html
page = requests.get("https://itunes.apple.com/in/genre/ios-business/id6000?mt=8")
tree = html.fromstring(page.text)
flist = []
plist = []
for i in range(0, 100):
    app = tree.xpath("//div[@class='column first']/ul/li/a/@href")
    ap = app[0]
    page1 = requests.get(ap)

When I try range(0, 2) it works, but when I put the range in the hundreds it shows the following error:
Traceback (most recent call last):
  File "/home/preetham/Desktop/eg.py", line 17, in <module>
    page1 = requests.get(ap)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 383, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 486, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='itunes.apple.com', port=443): Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8 (Caused by <class 'socket.gaierror'>: [Errno -2] Name or service not known)

Posted on 2014-07-23 06:55:39
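What is happening here is that the iTunes server is refusing your connection (you are sending too many requests from the same IP address in a short period of time):
Max retries exceeded with url: /in/app/adobe-reader/id469337564?mt=8
The error trace is misleading; it should read something like "No connection could be made because the target machine actively refused it".
There is an issue about this for the python requests library on GitHub.
To work around this (it is not so much a problem as a misleading debug trace), you should catch connection-related exceptions like so: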
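try:
    page1 = requests.get(ap)
except requests.exceptions.ConnectionError:
    # No response object exists when the connection fails, so record the failure explicitly.
    page1 = None
    status = "Connection refused"

Another way to work around this is to leave a large enough gap between the requests you send to the server. This can be done with the sleep(timeinsec) function in python (don't forget to import sleep):

from time import sleep

All in all, requests is an awesome python library; hope that solves your problem.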
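The two suggestions above (catch the connection error, and pause between requests) can be combined. The following is only a minimal sketch, not the answerer's code: fetch_politely, its fetch parameter and the one-second default delay are made-up names and values for illustration. It catches OSError because requests.exceptions.ConnectionError derives from IOError, which is the same class as OSError on Python 3.

```python
import time


def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing between consecutive requests.

    `fetch` is any callable that takes a URL and returns a response,
    for example requests.get. A failed fetch is recorded as None.
    All names here are hypothetical and chosen for this illustration.
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        try:
            results[url] = fetch(url)
        except OSError:
            # Covers requests.exceptions.ConnectionError (an IOError subclass)
            # as well as the built-in ConnectionError.
            results[url] = None
    return results
```

With requests installed, this could be called as fetch_politely(app, requests.get, delay_seconds=2.0), where app is the list of links from the question. A suitable delay depends on the server and is a guess here.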
Posted on 2017-11-24 22:10:39
Just use the features requests already provides:
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry
session = requests.Session()
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)
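session.get(url)

This will GET the URL and retry 3 times in case of requests.exceptions.ConnectionError. backoff_factor applies a delay between attempts, to avoid failing again when the server enforces a periodic request quota.
Take a look at requests.packages.urllib3.util.retry.Retry; it has many options to simplify retries.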
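To get a feel for what backoff_factor does, here is a rough sketch of the delay schedule. It is an approximation based on the formula given in the urllib3 documentation (no delay after the first failure, then backoff_factor * 2 ** (n - 1) after the n-th consecutive failure, capped at a maximum). The function name approx_backoff_delays is hypothetical, the 120-second cap is assumed, and the exact behaviour may differ between urllib3 versions.

```python
def approx_backoff_delays(backoff_factor, retries, backoff_max=120.0):
    """Approximate the pauses taken before each retry.

    This is a hand-written illustration, not a call into urllib3.
    Assumption: the first retry happens immediately and later ones wait
    backoff_factor * 2 ** (n - 1) seconds, never more than backoff_max.
    """
    delays = []
    for n in range(1, retries + 1):
        if n == 1:
            delays.append(0.0)  # first retry: no pause
        else:
            delays.append(min(backoff_max, backoff_factor * 2 ** (n - 1)))
    return delays


# Same values as the Retry(connect=3, backoff_factor=0.5) call above.
print(approx_backoff_delays(0.5, 3))
```

Under these assumptions, the settings above would wait roughly 0, 1 and 2 seconds before the three retries.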
Posted on 2017-03-09 17:00:59
Just do this:
Paste the following code in place of page = requests.get(url):
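import time

page = ''
while page == '':
    try:
        page = requests.get(url)
        break
    except requests.exceptions.ConnectionError:
        # Catch only connection failures; a bare except would also swallow KeyboardInterrupt.
        print("Connection refused by the server..")
        print("Let me sleep for 5 seconds")
        print("ZZzzzz...")
        time.sleep(5)
        print("Was a nice sleep, now let me continue...")
        continue

You're welcome :)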
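One caveat with the loop above is that it never gives up if the server stays unreachable. A bounded variant might look like the sketch below. This is an illustration rather than the answerer's code: get_with_retries, max_attempts and the other parameter names are hypothetical, and fetch stands in for requests.get.

```python
import time


def get_with_retries(url, fetch, max_attempts=5, delay_seconds=5.0, sleep=time.sleep):
    """Call fetch(url), retrying on connection errors up to max_attempts times.

    Re-raises the last error once the attempts are used up.
    All names here are hypothetical and chosen for this illustration.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except OSError as error:
            # requests.exceptions.ConnectionError is an IOError (OSError) subclass.
            last_error = error
            if attempt < max_attempts:
                sleep(delay_seconds)  # wait before the next attempt
    raise last_error
```

With requests installed, page = get_with_retries(url, requests.get) would replace the while loop. Five attempts and a five-second pause are arbitrary defaults.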
https://stackoverflow.com/questions/23013220