Q: pdfplumber gives fp.seek(pos) AttributeError: 'dict' object has no attribute 'seek'

Stack Overflow user

Asked 2020-09-22 15:19:12

Answers: 1 · Views: 392 · Followers: 0 · Votes: 0

So here is my code:

def main():    
    import combinedparser as cp
    from tkinter.filedialog import askopenfilenames

    files = askopenfilenames()
    print(files) #this gives the right files as a list of strings composed of path+filename


    def file_discriminator(func):
        def wrapper():
            results = []
            for item in files:
                if item.endswith('.pdf'):
                    print(item + 'is pdf')
                    func = f1(file = item)
                    results.append(item, Specimen_Output)
                else:
                    print(item + 'is text')
                    func = f2(file = item)
                    results.append(item, Specimen_Output)

        return wrapper


    @file_discriminator
    def parse_me(**functions):
        print(results)


    parse_me(f1 = cp.advparser(), f2 = cp.vikparser())

main()

where combinedparser.py has two functions:

def advparser(**file):
    import pdfplumber
    with pdfplumber.open(file) as pdf:  # opened fname and assigned it to the variable pdf
        page = pdf.pages[0]  # assigned index 0 of pages to the variable page
        text = page.extract_words()
    #followed by a series of python operations generating a dict named Specimen_Output
def vikparser(**file):
    with open(file, mode = 'r') as filename:
        Specimen_Output = {}
    #followed by a series of python operations generating a dict named Specimen_Output

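I have a directory with PDF and text files randomly mixed together. I am trying to use the decorator @file_discriminator to run advparser, which uses pdfplumber plus some follow-up processing to extract useful information from PDF files, on the PDFs in the directory, and to run vikparser, which does ordinary text-file processing, on the text files. Each should produce a dict named Specimen_Output. When advparser lives in its own .py file, is defined as advparser(file), imports askopenfilename rather than its plural, and is called as advparser(file = askopenfilename()), I get the correct result; the same goes for vikparser (which reads the text file with readlines). But when I try to do this from a main module and call them from a parent function, I can't get it to work. I have tried nearly every possible permutation of where I call them, as well as positional vs. keyword arguments for 'file'.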

Once I fix whatever bugs I introduced by changing things around, this is the most common error I get:

Traceback (most recent call last):


 File "<input>", line 1, in <module>
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/main.py", line 29, in <module>
    parse_me(f1 = cp.advparser(), f2 = cp.vikparser())
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/combinedparser.py", line 12, in advparser
    with pdfplumber.open(file) as pdf:  # opened fname and assigned it to the variable pdf
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfplumber/pdf.py", line 48, in open
    return cls(path_or_fp, **kwargs)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfplumber/pdf.py", line 25, in __init__
    self.doc = PDFDocument(PDFParser(stream), password=password)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfminer/pdfparser.py", line 39, in __init__
    PSStackParser.__init__(self, fp)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfminer/psparser.py", line 502, in __init__
    PSBaseParser.__init__(self, fp)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfminer/psparser.py", line 172, in __init__
    self.seek(0)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfminer/psparser.py", line 514, in seek
    PSBaseParser.seek(self, pos)
  File "/Users/zachthomasadmin/PycharmProjects/pythonProject1/venv/lib/python3.8/site-packages/pdfminer/psparser.py", line 202, in seek
    self.fp.seek(pos)
AttributeError: 'dict' object has no attribute 'seek'

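What am I doing wrong? What dict object is it talking about, and why doesn't pdfplumber have this problem when I call each type on its own with askopenfilename()? I'm new to programming and have been banging my head against this all day. Thanks!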

python

python-3.x

pdf

python-decorators

python-pdfreader

1 Answer

Stack Overflow user

Answered 2020-09-22 15:50:39

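The problem is that the file parameter in the advparser and vikparser functions is actually a dictionary of named arguments, because it is defined with two asterisks. So when you call these functions this way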

func = f1(file = item)

the file parameter inside advparser or vikparser is actually equal to {"file": "some_filename.pdf"}.
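A minimal sketch of that behavior, using a hypothetical helper named show_kwargs that is not part of the original code. It also covers the call that appears in the traceback, cp.advparser() with no arguments at all, where the collected dict would simply be empty. Either way a dict, not a path or file object, is what reaches pdfplumber.open, and a dict has no seek method.

```python
def show_kwargs(**file):
    # `**file` collects every keyword argument into one dict named `file`.
    return file


print(show_kwargs(file="sample.pdf"))  # {'file': 'sample.pdf'}
print(show_kwargs())                   # {}

# A dict has no file-like methods, which reproduces the reported message.
try:
    show_kwargs().seek(0)
except AttributeError as exc:
    print(exc)  # 'dict' object has no attribute 'seek'
```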

You need to unpack your arguments:

def vikparser(**file):
    with open(file["file"], mode='r') as filename:
        pass

Or just use a single file parameter in the function definition:

def vikparser(file):
    with open(file, mode='r') as filename:
        pass

Votes: -1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/64004821
