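I am writing a script to execute millions of API calls in parallel.
I am using Python 3.6 with aiohttp for this purpose. I expected uvloop to make it faster, but it seems to have made it slower. Am I doing something wrong?
With uvloop: 22 seconds
Without uvloop: 15 seconds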
import asyncio
import aiohttp
import uvloop
import time
import logging
from aiohttp import ClientSession, TCPConnector

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger()

urls = ["http://www.yahoo.com","http://www.bbcnews.com","http://www.cnn.com","http://www.buzzfeed.com","http://www.walmart.com","http://www.emirates.com","http://www.kayak.com","http://www.expedia.com","http://www.apple.com","http://www.youtube.com"]
bigurls = 10 * urls

def run(enable_uvloop):
    try:
        if enable_uvloop:
            loop = uvloop.new_event_loop()
        else:
            loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        start = time.time()
        conn = TCPConnector(limit=5000, use_dns_cache=True, loop=loop, verify_ssl=False)
        with ClientSession(connector=conn) as session:
            tasks = asyncio.gather(*[asyncio.ensure_future(do_request(url, session)) for url in bigurls]) # tasks to do
            results = loop.run_until_complete(tasks) # loop until done
            end = time.time()
            logger.debug('total time:')
            logger.debug(end - start)
            return results
        loop.close()
    except Exception as e:
        logger.error(e, exc_info=True)

async def do_request(url, session):
    """
    """
    try:
        async with session.get(url) as response:
            resp = await response.text()
            return resp
    except Exception as e:
        logger.error(e, exc_info=True)

run(True)
#run(False)

Posted on 2019-10-09 13:11:14
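You are not alone; I actually just got similar results (which led me to google my findings and brought me here).
My experiment involves running 500 concurrent GET requests to Google.com using aiohttp.
Here is the code for reference: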
import asyncio
import aiohttp
from datetime import datetime
import uvloop


class UVloopTester():
    def __init__(self):
        self.timeout = 20
        self.threads = 500  # number of concurrent requests (coroutines, not OS threads)
        self.totalTime = 0
        self.totalRequests = 0

    @staticmethod
    def timestamp():
        return f'[{datetime.now().strftime("%H:%M:%S")}]'

    async def getCheck(self):
        # Opens a fresh session for every request
        async with aiohttp.ClientSession() as session:
            response = await session.get('https://www.google.com', timeout=self.timeout)
            response.close()
            await session.close()
        return True

    async def testRun(self, id):
        now = datetime.now()
        try:
            if await self.getCheck():
                elapsed = (datetime.now() - now).total_seconds()
                print(f'{self.timestamp()} Request {id} TTC: {elapsed}')
                self.totalTime += elapsed
                self.totalRequests += 1
        # asyncio.TimeoutError replaces the private concurrent.futures._base.TimeoutError
        except asyncio.TimeoutError:
            print(f'{self.timestamp()} Request {id} timed out')

    async def main(self):
        await asyncio.gather(*[asyncio.ensure_future(self.testRun(x)) for x in range(self.threads)])

    def start(self):
        # comment these lines to toggle
        uvloop.install()
        asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

        # Create the loop explicitly; it comes from whichever policy is active
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)

        now = datetime.now()
        loop.run_until_complete(self.main())
        elapsed = (datetime.now() - now).total_seconds()

        print(f'{self.timestamp()} Main TTC: {elapsed}')
        print()
        print(f'{self.timestamp()} Average TTC per Request: {self.totalTime / self.totalRequests}')

        # Cancel anything still pending (asyncio.Task.all_tasks() was removed in Python 3.9)
        pending = asyncio.all_tasks(loop)
        if pending:
            for task in pending:
                task.cancel()
            try:
                loop.run_until_complete(asyncio.gather(*pending))
            except asyncio.CancelledError:
                pass
        loop.close()


test = UVloopTester()
test.start()

I have not planned and executed any kind of careful experiment in which I log my findings and calculate standard deviations and p-values. But I have run this a (tiring) number of times and came up with the following results.
Running without uvloop:
Running with uvloop:
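I shared this code with a friend of mine, who is actually the one who suggested I try uvloop (since he gets a speed boost from it). After running it several times, his results confirm that he does see a speedup from uvloop (shorter time to completion for both main() and the requests, on average).
Our findings lead me to believe that the difference comes down to our setups: I am using a Debian virtual machine with 8 GB of RAM on a mid-range laptop, while he is using a native Linux desktop with a lot more "muscle" under the hood.
My answer to your question is: no, I do not believe you are doing anything wrong, since I am seeing the same results and, although any constructive criticism is welcome and appreciated, it appears that I am not doing anything wrong either.
I wish I could be of more help; I hope my chiming in is of some use.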
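One way to check whether the difference really comes from the machine rather than from the network is to repeat the measurement against a local server and report the mean and standard deviation. The following is only a minimal sketch under stated assumptions: the echo server, the helper names (handle_echo, one_round_trip, workload, benchmark) and the client and repeat counts are all hypothetical and not part of the code above, and uvloop is treated as optional so the script still runs where it is not installed.

```python
import asyncio
import time
from statistics import mean, stdev

try:
    import uvloop  # optional; the sketch still runs without it
except ImportError:
    uvloop = None


async def handle_echo(reader, writer):
    # Server side: echo one line back to the client, then close.
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def one_round_trip(port):
    # Client side: connect, send one line, read the echo, close.
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(b'ping\n')
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply


async def workload(n_clients):
    # Local echo server on an OS-assigned port (assumption: loopback is available).
    server = await asyncio.start_server(handle_echo, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    start = time.perf_counter()
    replies = await asyncio.gather(*[one_round_trip(port) for _ in range(n_clients)])
    elapsed = time.perf_counter() - start
    server.close()
    await server.wait_closed()
    assert all(reply == b'ping\n' for reply in replies)
    return elapsed


def benchmark(new_loop, n_clients=50, repeats=5):
    # Run the workload on a fresh loop each time and summarise the timings.
    timings = []
    for _ in range(repeats):
        loop = new_loop()
        try:
            timings.append(loop.run_until_complete(workload(n_clients)))
        finally:
            loop.close()
    return mean(timings), stdev(timings)


if __name__ == '__main__':
    avg, sd = benchmark(asyncio.new_event_loop)
    print(f'asyncio: avg={avg:.4f} s stdev={sd:.4f} s')
    if uvloop is not None:
        avg, sd = benchmark(uvloop.new_event_loop)
        print(f'uvloop:  avg={avg:.4f} s stdev={sd:.4f} s')
```

Because everything stays on the loopback interface, remote latency and DNS drop out of the numbers, so what remains is closer to the cost of the event loop itself on a given machine. It does not reproduce a real HTTP client workload, so the results are only indicative.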
Posted on 2018-11-29 18:56:48
I tried a similar experiment and saw no real difference between the uvloop and asyncio event loops for parallel HTTP GETs:
asyncio event loop: avg=3.6285968542099 s. stdev=0.5583842811362075 s.
uvloop event loop: avg=3.419699764251709 s. stdev=0.13423859428541632 s.
The significant benefits of uvloop may come into play when it is used in server code, i.e. for handling many incoming requests.
Code:
import time
from statistics import mean, stdev

import asyncio
import uvloop
import aiohttp

urls = [
    # comma added after the oracle.com URL; without it the two adjacent strings were concatenated
    'https://aws.amazon.com', 'https://google.com', 'https://microsoft.com', 'https://www.oracle.com/index.html',
    'https://www.python.org', 'https://nodejs.org', 'https://angular.io', 'https://www.djangoproject.com',
    'https://reactjs.org', 'https://www.mongodb.com', 'https://reinvent.awsevents.com',
    'https://kafka.apache.org', 'https://github.com', 'https://slack.com', 'https://authy.com',
    'https://cnn.com', 'https://fox.com', 'https://nbc.com', 'https://www.aljazeera.com',
    'https://fly4.emirates.com', 'https://www.klm.com', 'https://www.china-airlines.com',
    'https://en.wikipedia.org/wiki/List_of_Unicode_characters', 'https://en.wikipedia.org/wiki/Windows-1252'
]


def timed(func):
    # Decorator: the wrapped coroutine returns its elapsed wall-clock time in seconds
    async def wrapper():
        start = time.time()
        await func()
        return time.time() - start
    return wrapper


@timed
async def main():
    conn = aiohttp.TCPConnector(use_dns_cache=False)
    async with aiohttp.ClientSession(connector=conn) as session:
        coroutines = [fetch(session, url) for url in urls]
        await asyncio.gather(*coroutines)


async def fetch(session, url):
    async with session.get(url) as resp:
        await resp.text()


asyncio_results = [asyncio.run(main()) for i in range(10)]
print(f'asyncio event loop: avg={mean(asyncio_results)} s. stdev={stdev(asyncio_results)} s.')

# Change to uvloop
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
uvloop_results = [asyncio.run(main()) for i in range(10)]
print(f'uvloop event loop: avg={mean(uvloop_results)} s. stdev={stdev(uvloop_results)} s.')

Posted on 2019-07-14 20:36:34
aiohttp recommends using aiodns.
Also, as far as I remember, the line "with ClientSession(connector=conn) as session:" should be "async with".
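A rough sketch of what that restructuring could look like follows. To keep it runnable without network access it uses a hypothetical stand-in class, DummySession, in place of aiohttp.ClientSession; the names DummySession, get_text and fetch_all are assumptions made for illustration only. With aiohttp itself, the assumption is that DummySession() would become ClientSession(connector=conn) and get_text would become the session.get(...) / response.text() pair from the question.

```python
import asyncio


class DummySession:
    """Hypothetical stand-in for aiohttp.ClientSession (no network access)."""

    def __init__(self):
        self.closed = False

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.closed = True

    async def get_text(self, url):
        await asyncio.sleep(0.01)  # simulate I/O latency
        return f'body of {url}'


async def do_request(url, session):
    return await session.get_text(url)


async def fetch_all(urls):
    # "async with" is only valid inside a coroutine, so the session is
    # opened here rather than in the synchronous run() function.
    async with DummySession() as session:
        return await asyncio.gather(*[do_request(url, session) for url in urls])


def run(urls):
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fetch_all(urls))
    finally:
        loop.close()  # always reached, unlike a close() placed after a return


if __name__ == '__main__':
    print(run(['http://a.example', 'http://b.example']))
```

The point of the shape is that the session is created, used and closed entirely inside one coroutine, and the synchronous function only creates the loop, runs that coroutine and closes the loop.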
https://stackoverflow.com/questions/47233547