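I am trying to loop through a list of URLs, download a series of .pdf files, and save them into a single .zip. For now I am just testing the code with one URL. The error I get is: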
Traceback (most recent call last):
  File "I:\test_pdf_download_zip.py", line 36, in <module>
    zip_file(zipfile_name, url)
  File "I:\test_pdf_download_zip.py", line 30, in zip_file
    myzip.write(dowload_pdf(url))
TypeError: expected a string or other character buffer object

Does anyone know how to correctly pass the .pdf request to the .zip (avoiding the error above) so that I can append it, or whether this is possible at all?

    import os
    import zipfile
    import requests

    output = r"I:"
    # File name of the zipfile
    zipfile_name = os.path.join(output, "test.zip")
    # Random test pdf
    url = r"http://www.pdf995.com/samples/pdf.pdf"

    def create_zipfile(zipfile_name):
        zipfile.ZipFile(zipfile_name, "w")

    def dowload_pdf(url):
        response = requests.get(url, stream=True)
        with open('test.pdf', 'wb') as f:
            f.write(response.content)

    def zip_file(zip_name, url):
        with open(zip_name,'a') as myzip:
            myzip.write(dowload_pdf(url))

    if __name__ == "__main__":
        create_zipfile(zipfile_name)
        zip_file(zipfile_name, url)
        print("Done")

Answered 2016-09-22 06:28:03

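Your download_pdf() function saves a file, but it does not return anything, so myzip.write() receives None and raises the TypeError. You need to modify it so that it actually returns the file path for myzip.write() to use. Rather than hard-coding test.pdf, pass a unique path to the download function so that you do not end up with multiple test.pdf entries in the archive.
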
    def dowload_pdf(url, path):
        response = requests.get(url, stream=True)
        with open(path, 'wb') as f:
            f.write(response.content)
        return path

https://stackoverflow.com/questions/39627036
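
Returning the path is only part of the fix. The question's zip_file() opens the archive with the built-in open(), so write() would append raw text to the file rather than add a zip entry. A minimal sketch of how the pieces might fit together, assuming the archive is opened with zipfile.ZipFile in append mode instead; the helper name zip_files and the use of raise_for_status() are illustrative choices, not part of the original answer:

```python
import os
import zipfile


def download_pdf(url, path):
    """Download url to path and return the path, as the answer suggests."""
    # Imported here so the zip helper below can be used without requests.
    import requests

    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors (assumed desirable)
    with open(path, "wb") as f:
        f.write(response.content)
    return path


def zip_files(zip_name, paths):
    """Append each file in paths to the archive under its base name.

    zip_files is a hypothetical helper name. Mode "a" creates the
    archive if it does not exist yet, so no separate create step is needed.
    """
    with zipfile.ZipFile(zip_name, "a") as myzip:
        for path in paths:
            myzip.write(path, arcname=os.path.basename(path))
```

With a list of URLs, one way to use this would be to call download_pdf() once per URL with a distinct local file name, collect the returned paths, and pass that list to zip_files().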