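I'm using requests-futures to fetch web pages asynchronously. My machine has multiple cores, so I also want to fetch many sites at the same time, and I tried concurrent.futures for that. It seems concurrent.futures also provides asynchronous methods, so what is the difference between the async in concurrent.futures and the async in requests-futures? If they are the same, does that mean I can drop requests-futures?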
Answered on 2014-07-26 03:31:32
requests-futures is just a very small wrapper on top of concurrent.futures. You can see this by looking at the source code (docstrings removed for brevity):
    from concurrent.futures import ThreadPoolExecutor

    from requests import Session
    from requests.adapters import DEFAULT_POOLSIZE, HTTPAdapter


    class FuturesSession(Session):

        def __init__(self, executor=None, max_workers=2, *args, **kwargs):
            super(FuturesSession, self).__init__(*args, **kwargs)
            if executor is None:
                executor = ThreadPoolExecutor(max_workers=max_workers)
                # set connection pool size equal to max_workers if needed
                if max_workers > DEFAULT_POOLSIZE:
                    adapter_kwargs = dict(pool_connections=max_workers,
                                          pool_maxsize=max_workers)
                    self.mount('https://', HTTPAdapter(**adapter_kwargs))
                    self.mount('http://', HTTPAdapter(**adapter_kwargs))

            self.executor = executor

        def request(self, *args, **kwargs):
            func = sup = super(FuturesSession, self).request

            background_callback = kwargs.pop('background_callback', None)
            if background_callback:
                def wrap(*args_, **kwargs_):
                    resp = sup(*args_, **kwargs_)
                    background_callback(self, resp)
                    return resp

                func = wrap

            return self.executor.submit(func, *args, **kwargs)  # This returns a concurrent.futures.Future

When you use requests-futures, you are actually using a concurrent.futures.ThreadPoolExecutor, which returns a concurrent.futures.Future when you submit a task to it. If you find the API that requests-futures provides more convenient for making HTTP requests, you can stick with it, and you can even pass the objects it returns to the other functions the concurrent.futures module provides.
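To illustrate that last point, here is a minimal sketch of the same wrapper pattern that runs without any network access. `TinyFuturesSession` and `fake_fetch` are hypothetical names made up for this example (they are not part of requests-futures); `fake_fetch` stands in for the blocking `Session.request` call. The point is only that whatever `executor.submit` returns is a plain `concurrent.futures.Future`, so helpers such as `as_completed` work on it directly.

```python
from concurrent.futures import Future, ThreadPoolExecutor, as_completed


def fake_fetch(url):
    # Hypothetical stand-in for requests.Session.request; performs no network I/O.
    return "response for " + url


class TinyFuturesSession:
    """Illustrative only: mirrors the shape of FuturesSession, not its real API."""

    def __init__(self, max_workers=2):
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def get(self, url):
        # Same idea as FuturesSession.request: hand the blocking call to the pool
        # and return the concurrent.futures.Future that submit() produces.
        return self.executor.submit(fake_fetch, url)


session = TinyFuturesSession(max_workers=4)
urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

# Map each Future back to the URL it was created for.
futures = {session.get(url): url for url in urls}

# as_completed comes from concurrent.futures and yields futures as they finish.
results = {}
for future in as_completed(futures):
    results[futures[future]] = future.result()

session.executor.shutdown()
```

Assuming the real `FuturesSession` behaves as the source above suggests, the futures returned by its `get` call could be collected and passed to `as_completed` or `wait` in the same way.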
https://stackoverflow.com/questions/24967101