Python requests library fails where Selenium succeeds, despite the same URL and the same request headers: what is the difference?

Stack Overflow user
Asked 2021-12-23 17:10:47
Answers: 2 | Views: 714 | Followers: 0 | Votes: 1
# selenium-request.py

from seleniumwire import webdriver  # Import from seleniumwire

# Create a new instance of the Chrome driver
driver = webdriver.Chrome()

driver.get('https://www.cmegroup.com/content/cmegroup/en/tools-information/advisorySearch/jcr:content/full-par/cmeadvisorysearch.advisorySearch.advisorynotices:Advisory%20Notices.-.2.12|07|2021.01|01|2008.json')

for request in driver.requests:
    if request.response:
        print(request.response.headers)

When I run this code, I get the headers used by Selenium:

$ python selenium-request.py
Accept-Ranges: bytes
Access-Control-Allow-Origin: http://star-website.com
Content-Type: application/json
ETag: W/"36b8a-5d3d28ed9cc43"
Last-Modified: Thu, 23 Dec 2021 16:16:16 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: Apache
ServerID: e1
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=86400
Date: Thu, 23 Dec 2021 16:16:16 GMT
Content-Length: 46236
Connection: keep-alive
Content-Security-Policy: frame-ancestors 'self' *.cmegroup.com *.quikstrike.net commodex.co.il openexchange.community.cmegroup.com staging.tickertocker.com http://www.straitsfinancial.com www.straitsfinancial.com http://straitsfinancial.com https://www.home.saxo https://app.topsteptrader.com https://help.topsteptrader.com https://staging.topsteptrader.com https://blueeditsitecore.sys.dom https://bluesitecore.sys.dom https://sitecoredev.orange.saxobank.com https://sitecoredev-nocache.orange.saxobank.com https://sitecoredevedit.orange.tst2.dom http://star-website.com https://www.investing.com https://*.benzinga.com https://bz.zingbot.bz https://www.zingbot.bz https://gdcdyn.interactivebrokers.com https://www.interactivebrokers.com https://zingbot.bz https://www.zingbot.bz https://m.zingbot.bz https://bz.zingbot.bz https://dev.futuresfirstacademy.com https://uat.futuresfirstacademy.com https://futuresfirstacademy.com http://stage.barchart.com http://www.barchart.com https://www.infinityfutures.com https://kilofutures.com https://m.cqg.com https://mdemo.cqg.com *.chicago.cme.com:7822 https://uatm.cqg.com https://local.zingbot.bz https://www.gulfbondsukuk.org www.kgieworld.sg https://www.propex24.wpcomstaging.com https://www.propex24.com *.straitsfinancial.gate39tech.com us.straitsfinancial.com https://*.kapcoclients.com https://kapcoclients.com https://*.wallstreetbound.org https://wallstreetbound.org https://cofcointl.plateau.com https://rise.articulate.com https://members.tradeday.com http://blf-django.herokuapp.com https://www.bluelinefutures.com https://www.bluelinefutures.live https://www.bluelinefutures.trade https://login.chicago.cme.com https://loginnr.chicago.cme.com https://logincert.chicago.cme.com https://login-ny.chicago.cme.com https://ampfutures.com https://cme.ampfutures.com https://*.advantagefutures.com https://*.e-futures.com https://*.etrade.com https://*.gffbrokers.com https://infinityfutures-cn.com https://sweetfutures.com https://*.tradovate.com 
https://home.saxo https://*.tickmill.co.uk https://*.directa.it https://big.pt https://*.tradestation-international.com https://*.stonex.com http://tradinglesson.com https://tradinglesson.com *.ibroker.it *.ibroker.es *.cornertrader.ch *.whselfinvest.com *.banxbroker.de *.ameritrade.com *.sweetfutures.com  *.danielstrading.com  *.gainfutures.com  *.futuresonline.com *.tdainc.com *.lsvp.com *.schwab.com *.schwab.co.uk *.us.global.schwab.com *.dev.schwab.com;
Set-Cookie: ak_bmsc=AB0A9701302106EABE2E195C6AC2A074~000000000000000000000000000000~YAAQLtERAvOZVN19AQAA7C8U6A7AWr7StAmiphZPltguFftPSOXgfa2NAq7Vts+40k7AdnPG55ULK1vyBRhPRdqWbtYml3JTC3RjHLu31l8kWBFvysYyuY2uz4GpkvmOWoBSN/Dl/2bQ9bEgbiYj3tCZ1o+wEvMfsiAWiJeMY3M1ozu6nyQz0JVpdvfsqun3z5wGhpJWhkjrJjeIyHvVdzx2uyIb1azRFlHT+nRCR6NHGoaMM/G2sI1DqPOXPB5btXjdncvB739c2Beh7RgWD/zvb78qpAJDUR1KOenDy1EwN2Bg8pqH1sxlsoVrl7i7r/pAOaWKfd4U1FKP7p730GfOp/m2VRBIdYgHDPHPvGeITPKrR/G22aR886r9Lerhug==; Domain=.cmegroup.com; Path=/; Expires=Thu, 23 Dec 2021 18:16:01 GMT; Max-Age=7185; HttpOnly

I copied these exact headers into a Python requests call as follows:

# python-request.py
import requests

headers = {
    "Accept-Ranges": "bytes",
    "Access-Control-Allow-Origin": "http://star-website.com",
    "Content-Type": "application/json",
    "ETag": 'W/"36b8a-5d3d28ed9cc43"',
    "Last-Modified": "Thu, 23 Dec 2021 16:16:16 GMT",
    "Referrer-Policy": "no-referrer-when-downgrade",
    "Server": "Apache",
    "ServerID": "e1",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Vary": "Accept-Encoding",
    "Content-Encoding": "gzip",
    "Cache-Control": "max-age=86400",
    "Date": "Thu, 23 Dec 2021 16:16:16 GMT",
    "Content-Length": "46236",
    "Connection": "keep-alive",
    "Content-Security-Policy": "frame-ancestors 'self' *.cmegroup.com *.quikstrike.net commodex.co.il openexchange.community.cmegroup.com staging.tickertocker.com http://www.straitsfinancial.com www.straitsfinancial.com http://straitsfinancial.com https://www.home.saxo https://app.topsteptrader.com https://help.topsteptrader.com https://staging.topsteptrader.com https://blueeditsitecore.sys.dom https://bluesitecore.sys.dom https://sitecoredev.orange.saxobank.com https://sitecoredev-nocache.orange.saxobank.com https://sitecoredevedit.orange.tst2.dom http://star-website.com https://www.investing.com https://*.benzinga.com https://bz.zingbot.bz https://www.zingbot.bz https://gdcdyn.interactivebrokers.com https://www.interactivebrokers.com https://zingbot.bz https://www.zingbot.bz https://m.zingbot.bz https://bz.zingbot.bz https://dev.futuresfirstacademy.com https://uat.futuresfirstacademy.com https://futuresfirstacademy.com http://stage.barchart.com http://www.barchart.com https://www.infinityfutures.com https://kilofutures.com https://m.cqg.com https://mdemo.cqg.com *.chicago.cme.com:7822 https://uatm.cqg.com https://local.zingbot.bz https://www.gulfbondsukuk.org www.kgieworld.sg https://www.propex24.wpcomstaging.com https://www.propex24.com *.straitsfinancial.gate39tech.com us.straitsfinancial.com https://*.kapcoclients.com https://kapcoclients.com https://*.wallstreetbound.org https://wallstreetbound.org https://cofcointl.plateau.com https://rise.articulate.com https://members.tradeday.com http://blf-django.herokuapp.com https://www.bluelinefutures.com https://www.bluelinefutures.live https://www.bluelinefutures.trade https://login.chicago.cme.com https://loginnr.chicago.cme.com https://logincert.chicago.cme.com https://login-ny.chicago.cme.com https://ampfutures.com https://cme.ampfutures.com https://*.advantagefutures.com https://*.e-futures.com https://*.etrade.com https://*.gffbrokers.com https://infinityfutures-cn.com https://sweetfutures.com 
https://*.tradovate.com https://home.saxo https://*.tickmill.co.uk https://*.directa.it https://big.pt https://*.tradestation-international.com https://*.stonex.com http://tradinglesson.com https://tradinglesson.com *.ibroker.it *.ibroker.es *.cornertrader.ch *.whselfinvest.com *.banxbroker.de *.ameritrade.com *.sweetfutures.com  *.danielstrading.com  *.gainfutures.com  *.futuresonline.com *.tdainc.com *.lsvp.com *.schwab.com *.schwab.co.uk *.us.global.schwab.com *.dev.schwab.com;",
    "Set-Cookie": "ak_bmsc=AB0A9701302106EABE2E195C6AC2A074~000000000000000000000000000000~YAAQLtERAvOZVN19AQAA7C8U6A7AWr7StAmiphZPltguFftPSOXgfa2NAq7Vts+40k7AdnPG55ULK1vyBRhPRdqWbtYml3JTC3RjHLu31l8kWBFvysYyuY2uz4GpkvmOWoBSN/Dl/2bQ9bEgbiYj3tCZ1o+wEvMfsiAWiJeMY3M1ozu6nyQz0JVpdvfsqun3z5wGhpJWhkjrJjeIyHvVdzx2uyIb1azRFlHT+nRCR6NHGoaMM/G2sI1DqPOXPB5btXjdncvB739c2Beh7RgWD/zvb78qpAJDUR1KOenDy1EwN2Bg8pqH1sxlsoVrl7i7r/pAOaWKfd4U1FKP7p730GfOp/m2VRBIdYgHDPHPvGeITPKrR/G22aR886r9Lerhug==; Domain=.cmegroup.com; Path=/; Expires=Thu, 23 Dec 2021 18:16:01 GMT; Max-Age=7185; HttpOnly"
}


requests.get(
    "https://www.cmegroup.com/content/cmegroup/en/tools-information/advisorySearch/jcr:content/full-par/cmeadvisorysearch.advisorySearch.advisorynotices:Advisory%20Notices.-.2.12|07|2021.01|01|2008.json",
    headers=headers)

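When I run it, it just hangs indefinitely, so something is wrong with the request.

Apart from the headers, what is different between the request made by Python and the one made by Selenium? How can I identify the problem and, hopefully, get this working with the Python requests library?

Update

I updated the code to print request.headers instead: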

Host: www.cmegroup.com
Connection: keep-alive
sec-ch-ua: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Linux"
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9

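... but the Python requests script gives the same result with these headers: it just hangs (or times out, if I set a timeout argument).

Further update

The debug output is as follows: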

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): www.cmegroup.com:443
send: b'GET /content/cmegroup/en/tools-information/advisorySearch/jcr:content/full-par/cmeadvisorysearch.advisorySearch.advisorynotices:Advisory%20Notices.-.2.12%7C07%7C2021.01%7C01%7C2008.json HTTP/1.1\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36\r\nAccept-Encoding: gzip, deflate, br\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9\r\nConnection: keep-alive\r\nHost: www.cmegroup.com\r\nsec-ch-ua: " Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"\r\nsec-ch-ua-mobile: ?0\r\nsec-ch-ua-platform: Linux\r\nUpgrade-Insecure-Requests: 1\r\nSec-Fetch-Site: none\r\nSec-Fetch-Mode: navigate\r\nSec-Fetch-User: ?1\r\nSec-Fetch-Dest: document\r\nAccept-Language: en-US,en;q=0.9\r\n\r\n'
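
Debug output of this shape can be produced with the standard library alone. The following is a minimal sketch, assuming the common recipe of raising the debug level of `http.client` (which prints the `send:` lines) and setting the `urllib3` logger to DEBUG (which prints the `connectionpool` lines). The helper name `enable_http_debug` is hypothetical.

```python
import http.client
import logging


def enable_http_debug():
    """Turn on wire-level logging for HTTP clients built on http.client.

    Hypothetical helper for illustration; requests relies on urllib3,
    which in turn builds on http.client.
    """
    # Makes http.client print the raw request ("send: b'GET ...'").
    http.client.HTTPConnection.debuglevel = 1

    # Default basicConfig format is LEVEL:logger-name:message.
    logging.basicConfig(level=logging.DEBUG)
    urllib3_logger = logging.getLogger("urllib3")
    urllib3_logger.setLevel(logging.DEBUG)
    urllib3_logger.propagate = True
    return urllib3_logger


logger = enable_http_debug()
```

Calling it once before `requests.get(...)` should print the request line and headers as they are sent, which makes them easier to compare with what the browser sent.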

2 Answers

Stack Overflow user

Accepted answer

Posted 2021-12-27 16:03:09

It looks like it just needs a compatible User-Agent header.

import requests
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0',
}

url = 'https://www.cmegroup.com/content/cmegroup/en/tools-information/advisorySearch/jcr:content/full-par/cmeadvisorysearch.advisorySearch.advisorynotices:Advisory%20Notices.-.2.12|07|2021.01|01|2008.json'

response = requests.get(url, headers = headers, timeout = 30) # A
print(response.status_code)    # Prints 200 (OK).
print(response.json())         # Prints the output as JSON. "item" key has 50 values in a list.

This snippet worked for me.

Votes: 1

Stack Overflow user

Posted 2021-12-24 07:06:40

It looks like you are using the response headers rather than the request headers. Try:

print(request.headers)
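
To illustrate the distinction, here is a rough sketch that separates headers a server sends back from headers a client could reasonably send. The `RESPONSE_ONLY` set is a hand-picked, non-exhaustive assumption for illustration, not an official list, and `split_headers` is a hypothetical helper; the sample values are taken from the question.

```python
# Hand-picked, non-exhaustive set of header names that only make sense in a
# response (an assumption for illustration, not an official classification).
RESPONSE_ONLY = {
    "accept-ranges",
    "access-control-allow-origin",
    "content-security-policy",
    "etag",
    "last-modified",
    "referrer-policy",
    "server",
    "set-cookie",
    "strict-transport-security",
    "vary",
}


def split_headers(headers):
    """Return (sendable, response_only); names are compared case-insensitively."""
    sendable, response_only = {}, {}
    for name, value in headers.items():
        target = response_only if name.lower() in RESPONSE_ONLY else sendable
        target[name] = value
    return sendable, response_only


# A few headers from the question, mixing both kinds.
copied = {
    "Server": "Apache",
    "ETag": 'W/"36b8a-5d3d28ed9cc43"',
    "Vary": "Accept-Encoding",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}

sendable, response_only = split_headers(copied)
print(sorted(sendable))        # ['Accept-Language', 'User-Agent']
print(sorted(response_only))   # ['ETag', 'Server', 'Vary']
```

Only the `sendable` part would be a candidate for the `headers=` argument of `requests.get`; the rest describes the server's reply and is not something the browser sent.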
Votes: 1
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/70465344
