
Q: AssertionError when reading a PDF file in Python with PyPDF2

Stack Overflow user

Asked 2020-05-21 16:11:51

Answers: 1, Views: 319, Followers: 0, Votes: 0

When I try to read a PDF file, I get the following error.

Code:

from PyPDF2 import PdfFileReader
import os

os.chdir("Path to dir")

pdf_document = 'sample.pdf'
pdf = PdfFileReader(pdf_document,'rb') #Error here

Error:

Traceback (most recent call last):
  File "/home/krishna/PycharmProjects/sample/sample.py", line 9, in <module>
    pdf = PdfFileReader(filehandle)
  File "/home/krishna/PycharmProjects/AI_DRC/venv/lib/python3.6/site-packages/PyPDF2/pdf.py", line 1084, in __init__
    self.read(stream)
  File "/home/krishna/PycharmProjects/AI_DRC/venv/lib/python3.6/site-packages/PyPDF2/pdf.py", line 1838, in read
    assert start >= last_end
AssertionError

Note: the file size is 18 MB.

Tags: python, pdf, python-3.6, pypdf2

1 Answer

Stack Overflow user

Answered 2020-05-21 16:33:59

The following works fine for me with the PDF in the same folder as the script. You can also use os to build the path as a string.

import PyPDF2

# Open the file by name; a path built with os works as well.
pdf_file = PyPDF2.PdfFileReader("Sample.pdf")

# Get the first page (index 0) and extract its text.
page_content = pdf_file.getPage(0).extractText()

# Do whatever you need with the extracted text.
print(page_content)

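I think the problem in your program is the "rb" argument. That is a mode string used in regular file handling with open(); PdfFileReader does not take a file mode, and its second positional parameter is strict. PyPDF2 already provides the classes PdfFileReader, PdfFileWriter and PdfFileMerger for working with PDF files. I hope this helps with your problem, and I will do my best to reply.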
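To illustrate where the "rb" mode belongs, here is a minimal sketch using the legacy PyPDF2 (before 3.0) names from this thread. The helper name first_page_text and its reader_factory parameter are hypothetical, added only so the function can be exercised without a real PDF. Passing strict=False is an assumption about what the asker wants, and it may not by itself remove the AssertionError, which is raised while PyPDF2 parses the file.

```python
def first_page_text(path, reader_factory=None):
    """Return the extracted text of the first page of the PDF at `path`.

    `reader_factory` is a hypothetical hook; by default the legacy
    PyPDF2 class PdfFileReader is used.
    """
    if reader_factory is None:
        # Imported lazily so the helper can be tested with a stand-in reader.
        from PyPDF2 import PdfFileReader
        reader_factory = PdfFileReader

    # "rb" is an argument to open(), not to PdfFileReader.
    with open(path, "rb") as fh:
        # strict is passed by keyword so it cannot be confused with a mode.
        reader = reader_factory(fh, strict=False)
        # Extract inside the with block: the reader pulls objects from
        # the stream on demand, so the file must still be open here.
        return reader.getPage(0).extractText()
```

With PyPDF2 installed, calling print(first_page_text("sample.pdf")) would print the text of the first page, provided the file can be parsed.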

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/61930075
