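I'm new to Python. Here is my setup:
I have Python 3. I want to download a CSV file from this URL: https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD
I'd like to use the requests library. I would appreciate help working out how to use requests to download the CSV file to a local directory on my machine.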
Posted on 2015-10-19 08:44:25
It is best to download the data as a stream and write it to the target file, or to an intermediate local file, as it arrives.
import requests


def download_file(url, output_file, compressed=True):
    """
    Download url to output_file and return the output path.

    compressed: enable response compression support
    """
    # NOTE the stream=True parameter: the body is fetched chunk by chunk
    # as it is iterated, instead of being loaded into memory all at once.
    headers = {}
    if compressed:
        headers["Accept-Encoding"] = "gzip"
    r = requests.get(url, headers=headers, stream=True)
    r.raise_for_status()  # stop on an HTTP error instead of saving an error page
    with open(output_file, 'wb') as f:  # open for binary writing
        for chunk in r.iter_content(chunk_size=4096):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
        f.flush()  # optional: closing the file flushes it as well
    return output_file

Applied to the URL from the question:
remote_csv = "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
local_output_file = "test.csv"
download_file(remote_csv, local_output_file)

# Check the file content, just for test purposes:
with open(local_output_file) as f:
    print(f.read())

The base code is taken from this answer: https://stackoverflow.com/a/16696317/176765
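Printing the whole file can produce a lot of output for a large CSV. As a lighter check, here is a minimal sketch that reads only the first few rows with the standard csv module. The helper name preview_csv and its max_rows parameter are made up for illustration, and UTF-8 is an assumption about the file's encoding.

```python
import csv
from itertools import islice


def preview_csv(path, max_rows=5):
    """Return the first max_rows rows of a CSV file as lists of strings."""
    # newline="" is what the csv module documentation recommends for files
    # handed to csv.reader; encoding="utf-8" is an assumption about the file.
    with open(path, newline="", encoding="utf-8") as f:
        return list(islice(csv.reader(f), max_rows))
```

For the file saved above, something like `for row in preview_csv("test.csv"): print(row)` would show the header and the first data rows without reading the rest of the file.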
More details on streaming the response body with the requests library are available here:
http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
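One point from that page is worth illustrating: with stream=True, requests cannot release the connection back to the pool until the body has been fully consumed or the response is closed, so the documentation suggests a with statement. The sketch below is one possible way to arrange that, not the only one. The helper name save_streamed_response is hypothetical; it only relies on the raise_for_status() and iter_content() methods of a requests Response.

```python
def save_streamed_response(response, output_file, chunk_size=8192):
    """Write a streamed response body to output_file; return the bytes written."""
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    written = 0
    with open(output_file, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            written += f.write(chunk)  # in Python 3, write() returns the byte count
    return written


# Usage (needs network access; the URL is the one from the question):
#     import requests
#     url = "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
#     with requests.get(url, stream=True) as r:
#         save_streamed_response(r, "test.csv")
```

Leaving the with block closes the response, so the connection is released even if writing the file fails partway through.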
https://stackoverflow.com/questions/33204944