我有一个Digikey产品页面的URL列表。我们的目标是打开每个网址,然后抓取定价信息并创建一个BoM。
我遇到的挑战是,在打开几个URL后,URLError开始出现403 (禁止)-即使我可以在我的(Chrome)浏览器中打开这些URL(在Mac上)。
从打开每个URL到决定在Python脚本中禁止打开URL,会有什么原因?谢谢!
代码如下:
from urllib.request import urlopen, Request, URLError
urls = ['https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RC0805JR-071KL',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=08055C333KAT2A',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=B72660M0251K072',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=HI1206T500R-10',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=LVR005NK-2',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RL1220S-120-F',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RMCF0805JT330R',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=IND-LED',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CHV1206-JW-224ELF',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RAC03-3.3SGA',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=202R18W102KV4E',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=GRM32DR72H104KW10L',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CRE1S0505S3C',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=SJ-3523-SMT-TR',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=ATM90E26-YU-RCT-ND',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CL21F104ZBCNNNC',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CL21A106KQCLRNC',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=535-9865-1-ND',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=c',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=CL21C180JBANNNC',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=BLM15AG100SN1D',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=RMCF0805JT51R0',
'https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=SI8651BB-B-IS1']
#####################################
for url in urls:
print(url)
try:
with urlopen(url) as response:
html = response.read()
print (html)
print("DONE WITH THIS URL.")
except URLError as e:
print(e.reason)发布于 2017-12-07 04:41:04
多亏了这些评论,数字键确实假设我的代码是一个机器人。“变通方法”包括:
谢谢。
https://stackoverflow.com/questions/47678971
复制相似问题