I am trying to download several PDFs with the requests library and merge them with pyPdf. In general this works well, but for some PDFs I just get an error.
MWE.py
import requests
from pyPdf import PdfFileWriter, PdfFileReader
from StringIO import StringIO
input = PdfFileReader(StringIO(response.content))
input.decrypt("")
output = PdfFileWriter()
output.addPage(input.getPage(0))
outputStream = file("document-output.pdf", "wb")
output.write(outputStream)
outputStream.close()
session.close()
Error
Traceback (most recent call last):
  File "mwe.py", line 21, in <module>
    input.decrypt("")
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 894, in decrypt
    return self._decrypt(password)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 904, in _decrypt
    user_password, key = self._authenticateUserPassword(password)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 945, in _authenticateUserPassword
    encrypt.get("/EncryptMetadata", BooleanObject(False)).getObject())
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1818, in _alg35
    key = _alg32(password, rev, keylen, owner_entry, p_entry, id1_entry)
  File "/usr/local/lib/python2.7/dist-packages/pyPdf/pdf.py", line 1729, in _alg32
    m.update(id1_entry)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
To produce this traceback I read the input from a file, but I don't think that matters in this case.
I found a few questions related to this error, but I could not use them to solve my specific problem.
Posted on 2016-06-14 22:38:07
OK, I found that this seems to be a bug in pyPdf (1.13): https://github.com/mstamy2/PyPDF2/issues/51
Using PyPDF2 (1.26.0) instead works as expected.
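The last frame of the traceback in the question, `m.update(id1_entry)`, suggests a text-versus-bytes problem: under Python 2, passing a unicode string to a hash object's `update` makes it encode the string with the ASCII codec first, which fails for non-ASCII characters. The following is a minimal sketch of that failure mode using only the Python 3 standard library. It is not pyPdf's code; `sample_id`, `ascii_encode_error`, and `md5_of_id` are hypothetical names, the sample value is made up, and the Latin-1 encoding is only an assumption chosen for illustration.

```python
import hashlib

# Hypothetical stand-in for a document ID that was decoded to text.
# The real value from the failing PDF is not known.
sample_id = "\u00fe\u00ff"


def ascii_encode_error(text):
    """Return the error message from encoding text as ASCII, or None."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


def md5_of_id(raw_id):
    """Hash an ID that is already bytes; hash objects operate on bytes."""
    m = hashlib.md5()
    m.update(raw_id)
    return m.hexdigest()


# Encoding non-ASCII text as ASCII fails, as in the traceback.
message = ascii_encode_error(sample_id)
print(message)

# Encoding explicitly before hashing avoids the implicit ASCII step.
# Latin-1 is an assumption here, not necessarily what the library should use.
digest = md5_of_id(sample_id.encode("latin-1"))
print(digest)
```

In Python 3, `update` rejects text outright with a `TypeError` instead of encoding it implicitly, so the same mistake surfaces earlier and with a clearer message.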
https://stackoverflow.com/questions/37822887