So I have been trying to find a solution that does not require me to rip out httpx and replace it with another library, especially since the availability of HTTP/2-capable async libraries is slim to none.
While I wait for a reply from the httpx team, I wanted to do a sanity check here to see whether what I am seeing is a genuine issue in the library, or just my own inexperience.
Code:

import httpx
import asyncio
from memory_profiler import profile
import aiohttp

@profile(precision=4)
async def memory_test(url):
    '''
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            html = await response.text()
            print(f'Length of response is: {len(html)}')
    '''
    async with httpx.AsyncClient(http2=True) as client:
        html = await client.get(url, follow_redirects=True)
        print(f'Length of response is: {len(html.text)}')
    del html
    return None

async def main():
    url = 'https://www.autoscout24.fr/offres/bmw-320-serie-3-touring-e91-touring-163ch-pack-m-sport-diesel-bleu-671904de-6139-4061-a451-f63bdb61de2b'
    result = await memory_test(url)

if __name__ == "__main__":
    asyncio.run(main())

Running it through memory_profiler gives me:
Line # Mem usage Increment Occurrences Line Contents
=============================================================
10 84.7266 MiB 84.7266 MiB 1 @profile(precision=4)
11 async def memory_test(url):
12
13 '''
14 async with aiohttp.ClientSession() as session:
15 async with session.get(url) as response:
16 html = await response.text()
17 print(f'Length of response is: {len(html)}')
18
19 '''
20 89.1055 MiB 1.8125 MiB 4 async with httpx.AsyncClient(http2=True) as client:
21
22 88.2461 MiB 1.7070 MiB 91 html = await client.get(url, follow_redirects=True)
23 89.1055 MiB 0.8594 MiB 1 print(f'Length of response is: {len(html.text)}')
24
25
26 89.1055 MiB 0.0000 MiB 1 del html
27 89.1055 MiB 0.0000 MiB 1 return None

A roughly 300 KB page ends up holding on to 4 MB+ of memory that is never released. Run a few thousand URLs through this and all available memory is consumed very quickly.
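The numbers above come from memory_profiler. As a cross-check that does not depend on any third-party package, the same "does repeated use leak?" question can be asked with the standard library's tracemalloc. This is a minimal sketch of the measurement pattern only: the `fake_fetch` function below is a stand-in I made up for the HTTP call (no network, no httpx), so that the snapshot-diff technique itself can be seen in isolation.

```python
# Cross-check memory growth with the stdlib's tracemalloc instead of
# memory_profiler: run a function repeatedly and diff two snapshots.
# fake_fetch is a stand-in for the real HTTP call -- the point here is
# the measurement pattern, not the HTTP client.
import tracemalloc


def fake_fetch(url: str) -> str:
    # Stand-in for client.get(url): build a ~300 KB response body.
    return "x" * 300_000


def measure_growth(n_calls: int = 50) -> int:
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    for _ in range(n_calls):
        html = fake_fetch("https://example.com")
        del html  # released each iteration, so no net growth is expected
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Net allocation difference, summed across all traced source lines.
    return sum(stat.size_diff for stat in after.compare_to(before, "lineno"))


if __name__ == "__main__":
    growth = measure_growth()
    print(f"Net growth after repeated calls: {growth} bytes")
```

If the real client leaked a few MB per call the way the profile above suggests, the same loop wrapped around the actual request would show a large positive diff instead of a near-zero one.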
When switching to aiohttp, however, the picture is different:
Line # Mem usage Increment Occurrences Line Contents
=============================================================
10 84.6523 MiB 84.6523 MiB 1 @profile(precision=4)
11 async def memory_test(url):
12
13
14 88.2812 MiB 0.0000 MiB 3 async with aiohttp.ClientSession() as session:
15 88.2812 MiB 2.2344 MiB 7 async with session.get(url) as response:
16 88.2812 MiB 1.3945 MiB 3 html = await response.text()
17 88.2812 MiB 0.0000 MiB 1 print(f'Length of response is: {len(html)}')
18
19 88.2812 MiB 0.0000 MiB 1 '''
20 async with httpx.AsyncClient(http2=True) as client:
21
22 html = await client.get(url, follow_redirects=True)
23 print(f'Length of response is: {len(html.text)}')
24 '''
25
26 87.6484 MiB -0.6328 MiB 1 del html
27 87.6484 MiB 0.0000 MiB 1 return None

Is this an issue with httpx, or am I expecting something unrealistic from Python? Reference: https://github.com/encode/httpx/discussions/2414
Thanks
Posted on 2022-10-21 09:23:09
Following the advice I received, I extended my test to request a single URL many times to see whether an actual OOM problem would occur. After running the function 150+ times against a 100 KB page, I saw memory usage plateau quickly, with only about a 5 MB increase in usage by the end of the run.
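The extended test described above can be sketched as a loop that fetches the same page repeatedly while sampling memory after each iteration. The real test used httpx against a live URL; in this sketch `fetch` is a local stub coroutine I introduced (returning a ~100 KB body, like the page in the test) so the loop structure runs without a network connection, and tracemalloc stands in for an external profiler.

```python
# Sketch of the plateau test: request the "same page" 150 times and
# sample traced memory after each iteration. fetch() is a local stub
# standing in for (await client.get(url, follow_redirects=True)).text.
import asyncio
import tracemalloc


async def fetch(url: str) -> str:
    await asyncio.sleep(0)      # yield to the event loop, like real I/O
    return "x" * 100_000        # ~100 KB body, like the page in the test


async def run_test(n: int = 150) -> list[int]:
    tracemalloc.start()
    samples = []
    for _ in range(n):
        html = await fetch("https://example.com")
        assert len(html) > 0
        del html                # drop the body before sampling
        current, _peak = tracemalloc.get_traced_memory()
        samples.append(current)
    tracemalloc.stop()
    return samples


if __name__ == "__main__":
    usage = asyncio.run(run_test())
    # If memory plateaus, late-run usage stays close to early-run usage.
    print(f"first: {usage[0]} bytes, last: {usage[-1]} bytes")
```

With a leaking client, the last samples would sit far above the first ones; a flat series is what the 150+-run test above observed.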
The code for this test can be seen at https://github.com/encode/httpx/discussions/2414
https://stackoverflow.com/questions/74137279