Question: Python - text processing

Stack Overflow user

Asked on 2017-04-06 18:36:34

1 answer · 586 views · 0 followers · 0 votes

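I am trying to use Python to process text that was scraped from a .pdf.

One of the ways I am trying to do this is to find a particular item and print the same line, the line before it, or the line after it.

I have looked around and followed some tutorials, which got me this far, but I can't work out how to move forward.

The code below uses the "find" function to locate information on the current line, but I need to be able to use it to print the following and preceding lines as well.

The scraped text looks like this:

Smith, John Per End 12/12/12 File: 12345

The code I am using is: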

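def main():
    # Read every line up front; the with-block closes the file for us.
    with open("Register.txt", "r") as file:
        lines = file.readlines()
    # Initialise the counters once, before the loop, so they accumulate
    # across lines instead of being reset on every iteration.
    countPerEnd = 0
    countFile = 0
    for line in lines:
        line = line.strip()
        if line.find("Per End") != -1:
            countPerEnd = countPerEnd + 1
        if line.find("File:") != -1:
            countFile = countFile + 1
    print("Per End: ", countPerEnd)
    print("File: ", countFile)
main()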

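I can get the lines I need to print, but I need to be able to narrow them down to the other items, in this case the name, and the number that follows "File:".

Those values can be anything; only the strings "Per End" and "File:" are always the same.

I print the results to see where the output stands.

The output is: Per End: 12/12/12

The output I need, based on finding "Per End", is: Smith, John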
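
As a rough illustration of the extraction described above, here is a minimal sketch that assumes each record sits on a single line in exactly the form shown in the sample ("<name> Per End <date> File: <number>"). The pattern RECORD and the helper parse_record are hypothetical names introduced for this example, and the real scraped text may be laid out differently.

```python
import re

# Assumed record layout (taken from the single sample line in the question):
#   "<name> Per End <date> File: <number>"
# Only the markers "Per End" and "File:" are treated as fixed.
RECORD = re.compile(
    r"^(?P<name>.*?)\s*Per End\s+(?P<date>\S+)\s+File:\s*(?P<number>\S+)"
)


def parse_record(line):
    """Return a dict with name, date and number, or None if the line does not match."""
    match = RECORD.search(line.strip())
    if match is None:
        return None
    return match.groupdict()


record = parse_record("Smith, John Per End 12/12/12 File: 12345")
if record is not None:
    print(record["name"])
    print(record["number"])
```

Anchoring on the two fixed markers means the name and number can contain anything, which matches the constraint stated in the question. Lines that do not follow the assumed layout simply return None.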

parsing

text

python

python-2.7

1 Answer

Stack Overflow user

Accepted answer

Posted on 2017-04-07 13:12:56

I'm not 100% sure what you are trying to do, but I think this should put you on the right track:

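# Read the file; the with-block closes it automatically.
with open("register.txt", "r") as f:
    lines = f.readlines()

search_counters = {
    "Per End": 0,
    "File:": 0,
}

# Strip whitespace and drop blank lines. readlines() keeps the trailing "\n",
# so a bare truthiness test on the raw line would not filter them out.
lines = [line.strip() for line in lines if line.strip()]
for i, line in enumerate(lines):
    for search_key in search_counters:
        if search_key in line:
            search_counters[search_key] += 1
            # Print the previous line if the current line contains "Per End".
            # The i > 0 guard stops lines[i - 1] wrapping around to the last line.
            if search_key == "Per End" and i > 0:
                print("previous line: " + lines[i - 1])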
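
The same indexing idea extends to the following line as well. The sketch below is one possible way to collect the previous, current, and next line for every match. The helper name find_with_context and the sample list are hypothetical, and the sample assumes one field per line, which may not match the real file.

```python
def find_with_context(lines, key):
    """Return (previous, current, next) tuples for every line containing key.

    previous or next is None when the match is the first or last line.
    """
    results = []
    for i, line in enumerate(lines):
        if key in line:
            prev_line = lines[i - 1] if i > 0 else None
            next_line = lines[i + 1] if i + 1 < len(lines) else None
            results.append((prev_line, line, next_line))
    return results


# Hypothetical sample data: one field per line (an assumed layout).
sample = ["Smith, John", "Per End 12/12/12", "File: 12345"]
for prev_line, line, next_line in find_with_context(sample, "Per End"):
    print("previous: " + str(prev_line))
    print("current:  " + line)
    print("next:     " + str(next_line))
```

Returning None at the edges, rather than letting a negative index wrap around, keeps a match on the first line from silently reporting the last line of the file as its predecessor.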

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/43263251
