
Getting 403 after several successful urlopen calls

Stack Overflow user
Asked 2017-12-07 00:23:49
Answers: 1  Views: 119  Followers: 0  Votes: 0

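I have a list of URLs for Digikey product pages. The goal is to open each URL, scrape the pricing information, and build a BoM.

The problem is that after a few URLs have been opened successfully, urlopen starts raising URLError with 403 (Forbidden), even though I can open the same URLs in my browser (Chrome on a Mac).

What could cause the site to go from serving each URL to forbidding the requests from my Python script? Thanks!

The code is below: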

from urllib.error import URLError
from urllib.request import urlopen
urls = ['https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RC0805JR-071KL',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=08055C333KAT2A',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=B72660M0251K072',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=HI1206T500R-10',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=LVR005NK-2',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RL1220S-120-F',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RMCF0805JT330R',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=IND-LED',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CHV1206-JW-224ELF',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RAC03-3.3SGA',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=202R18W102KV4E',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=GRM32DR72H104KW10L',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CRE1S0505S3C',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=SJ-3523-SMT-TR',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=ATM90E26-YU-RCT-ND',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CL21F104ZBCNNNC',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CL21A106KQCLRNC',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=535-9865-1-ND',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=c',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CL21C180JBANNNC',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=BLM15AG100SN1D',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RMCF0805JT51R0',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=SI8651BB-B-IS1']
#####################################
for url in urls:
    print(url)
    try:
        with urlopen(url) as response:
            html = response.read()
            print(html)
        print("DONE WITH THIS URL.")
    except URLError as e:
        print(e.reason)

1 Answer

Stack Overflow user

Answered 2017-12-07 04:41:04

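Thanks to the comments: Digikey does indeed assume that my code is a bot. The "workaround" consists of:

  • Not using "scripts" in the URL.
  • If an HTTP 403 is returned, randomly picking a different User-Agent.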
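
The second point could be sketched roughly as follows. This is only an illustration, not the exact code used: the function name `fetch`, the `USER_AGENTS` strings, and the `max_attempts`, `delay`, and `opener` parameters are all assumptions made for the example, and nothing here guarantees that a given site will accept the requests.

```python
import random
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Hypothetical example values; any realistic browser strings could be used.
USER_AGENTS = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0",
]


def fetch(url, user_agents=USER_AGENTS, opener=urlopen, max_attempts=3, delay=1.0):
    """Return the response body, switching User-Agent after each HTTP 403."""
    agent = random.choice(user_agents)
    for attempt in range(max_attempts):
        request = Request(url, headers={"User-Agent": agent})
        try:
            with opener(request) as response:
                return response.read()
        except HTTPError as e:
            # Give up on any error other than 403, or when attempts run out.
            if e.code != 403 or attempt == max_attempts - 1:
                raise
            # Pick a different agent than the one that was just rejected.
            others = [ua for ua in user_agents if ua != agent]
            agent = random.choice(others or user_agents)
            time.sleep(delay)  # pause so retries are not sent back to back
```

In the loop from the question, `urlopen(url)` would be replaced by a call such as `html = fetch(url)`. The `opener` parameter exists only so the retry logic can be exercised without network access.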

Thanks.

Votes: 0
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/47678971
