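I am trying to collect wind speed data for different weather stations in the UK from the Weather Underground website. I assume they have an API; I am just having a hard time connecting to it. Here is the XHR link I am using:
https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json?apiKey=6532d6454b8aa370768e63d6ba5a832e&units=e&startDate=20150101&endDate=20150131
This is the data I want. The following page shows the wind speed in a table:
https://www.wunderground.com/history/monthly/gb/darlington/EGNV/date/2015-1
My code is very simple: I first set up the request headers, and my function get_data returns the response as JSON.
In my main block, I append the data to a DataFrame and print it.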
from bs4 import BeautifulSoup
import pandas as pd
import requests
import urllib
from urllib.request import urlopen
# HTTP/2 pseudo-headers (':authority', ':path', ':scheme') are set by the browser
# and are not valid header names for requests, so they are left out here.
headers = {
    'accept': 'application/json, text/plain, */*',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-GB,en;q=0.9,en-US;q=0.8,da;q=0.7',
    'origin': 'https://www.wunderground.com',
    #'apiKey': '6532d6454b8aa370768e63d6ba5a832e',
    'referer': 'https://www.wunderground.com/history/monthly/gb/darlington/EGNV/date/2015-1',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'cross-site',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.87 Safari/537.36'
}
def get_data(response):
    # Decode the JSON body of the HTTP response.
    df = response.json()
    return df
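if __name__ == "__main__":
    # pd.datetime was removed in pandas 2.0; pd.Timestamp.now() is the replacement.
    date = pd.Timestamp.now().strftime("%d-%m-%Y")
    api_key = "6532d6454b8aa370768e63d6ba5a832e"
    start_date = "20150101"
    end_date = "20150131"
    urls = [
        "https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json?apiKey=" + api_key + "&units=e&startDate=" + start_date + "&endDate=" + end_date
    ]
    frames = []
    for url in urls:
        res = requests.get(url, headers=headers)
        data = get_data(res)
        # DataFrame.append was removed in pandas 2.0, so collect one frame
        # per response and concatenate them after the loop.
        frames.append(pd.json_normalize(data))
    df = pd.concat(frames, ignore_index=True)
    print(df)
The error I get is: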
SSLError: HTTPSConnectionPool(host='api.weather.com', port=443): Max retries exceeded with url: /v1/location/EGNV:9:GB/observations/historical.json?apiKey=6532d6454b8aa370768e63d6ba5a832e&units=e&startDate=20150101&endDate=20150131
(Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))
Update: Even when I skip the API and scrape the page with BS4 instead, I am still denied access. I don't know why. How can they detect my scraper?
Answered 2020-02-06 18:36:10
I solved it.
Adding verify=False to the requests.get() call gets me past this error.
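For reference, here is a minimal sketch of how that workaround might be wrapped, assuming the endpoint and query parameters from the XHR link in the question. The names build_params and fetch_observations and the session argument are hypothetical and only there for illustration. Note that verify=False switches off certificate checking entirely, so the connection is no longer protected against interception; requests also accepts a path to a CA bundle file as the verify value, which keeps verification on.

```python
import requests

# Endpoint taken from the XHR link in the question.
API_URL = "https://api.weather.com/v1/location/EGNV:9:GB/observations/historical.json"


def build_params(api_key, start_date, end_date, units="e"):
    """Return the query parameters used by the XHR link (hypothetical helper)."""
    return {
        "apiKey": api_key,
        "units": units,
        "startDate": start_date,
        "endDate": end_date,
    }


def fetch_observations(params, verify=True, session=None):
    """GET the endpoint and return the decoded JSON (hypothetical helper).

    verify=True      default certificate checking (what failed in the question)
    verify=False     the workaround from the answer; TLS checking is disabled
    verify="<path>"  check the server against a specific CA bundle file
    """
    # Assumption: callers may pass their own session; otherwise create one.
    http = session or requests.Session()
    response = http.get(API_URL, params=params, verify=verify, timeout=30)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()
```

A call such as fetch_observations(build_params(api_key, "20150101", "20150131"), verify=False) would reproduce the fix above. Passing the parameters as a dictionary lets requests handle the URL encoding instead of concatenating the query string by hand.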
https://stackoverflow.com/questions/60091893