
How to replace only the \n that has some characters after it

Stack Overflow user
Asked 2020-06-17 06:00:08
Answers: 5, Views: 303, Followers: 0, Votes: 2

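I am using pdfminer to convert a PDF to text. The problem is that pdfminer adds a \n at the end of every line of the PDF, even where the sentence has not ended. As you can see in the text below, every line is treated as a sentence, which is not correct. I have also included another version of the text that shows where the newline characters are. For example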

quan-
tum population.

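should be in one sentence. So I replaced \n with '' and that problem was fixed. But the other \n characters get replaced too, which I don't want.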

Balanced Quantum Classical Evolutionary Algorithm(BQCEA)

Muhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer Beg

Abstract
With advancement in Quantum computing, classical algorithms are adapted and integrated
with Quantum properties such as qubit representation and entanglement. Although these
properties perform better however pre-mature convergence is the main issue in Quantum
Evolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-
tum population. In this paper, we introduced a new way to update the quantum population
of QEA to avoid premature convergence

'Balanced Quantum Classical Evolutionary Algorithm(BQCEA)\n\nMuhammad Shahid, Hasan Mujtaba, 
Muhammad Asim, Omer Beg\n\nAbstract\nWith advancement in Quantum computing, classical 
algorithms are adapted and integrated\nwith Quantum properties such as qubit representation 
and entanglement', ' Although these\nproperties perform better however pre-mature 
convergence is the main issue in Quantum\nEvolutionary Algorithms(QEA) because QEA uses only 
the best individual to update quan-\ntum population', ' In this paper, we introduced a new 
way to update the quantum population\nof QEA to avoid premature convergence',

I tried this code.

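from nltk import tokenize  # assumption: tokenize is NLTK's nltk.tokenize module

# txt_str is the text extracted by pdfminer
lines = tokenize.sent_tokenize(txt_str)
for l in lines:
    s = l.replace('\n', '')
    print(s)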

This results in the following.

Balanced Quantum Classical Evolutionary Algorithm(BQCEA)Muhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer BegAbstractWith advancement in Quantum computing, classical algorithms are adapted and integratedwith Quantum properties such as qubit representation and entanglement.
Although theseproperties perform better however pre-mature convergence is the main issue in QuantumEvolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-tum population.
In this paper, we introduced a new way to update the quantum populationof QEA to avoid premature convergence.

But this is not what I want. I want this version of the text.

Balanced Quantum Classical Evolutionary Algorithm(BQCEA)

Muhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer Beg

Abstract
With advancement in Quantum computing, classical algorithms are adapted and integrated with Quantum properties such as qubit representation and entanglement. Although these properties perform better however pre-mature convergence is the main issue in Quantum Evolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-tum population. In this paper, we introduced a new way to update the quantum population of QEA to avoid premature convergence

I don't want the blank lines to disappear. I hope that makes sense.


5 Answers

Stack Overflow user

Accepted answer

Posted 2020-06-17 06:13:24

(?<=\S)(?<!\bAbstract)\n(?=\S)

You can try this. See the demo.

https://regex101.com/r/crj3aD/1

Python script:

import re
inp = "Balanced Quantum Classical Evolutionary Algorithm(BQCEA)\n\nMuhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer Beg\n\nAbstract\nWith advancement in Quantum computing, classical algorithms are adapted and integrated\nwith Quantum properties such as qubit representation and entanglement', ' Although these\nproperties perform better however pre-mature convergence is the main issue in Quantum\nEvolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-\ntum population', ' In this paper, we introduced a new way to update the quantum population\nof QEA to avoid premature convergence"

output = re.sub(r'(?<=\S)(?<!\bAbstract)\n(?=\S)', ' ', inp)
print(output)
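
To make the pattern easier to read, here is a minimal sketch that spells out the same regex in verbose mode and runs it on a shortened sample. The name `SINGLE_NEWLINE` and the sample string are illustrative assumptions, not part of the original answer.

```python
import re

# Same pattern as above, written in verbose mode so each piece can be commented.
SINGLE_NEWLINE = re.compile(
    r"""
    (?<=\S)            # the character before the newline is not whitespace
    (?<!\bAbstract)    # the newline does not directly follow the word Abstract
    \n                 # the newline itself
    (?=\S)             # the character after the newline is not whitespace
    """,
    re.VERBOSE,
)

# Shortened, hypothetical sample shaped like the text in the question.
sample = "Title(BQCEA)\n\nAuthors\n\nAbstract\nadapted and integrated\nwith Quantum properties"

# Only the lone newline inside the paragraph becomes a space;
# blank lines (\n\n) and the newline after the heading are left alone.
joined = SINGLE_NEWLINE.sub(" ", sample)
print(joined)
```

The two `\S` lookarounds are what protect the blank lines: in `\n\n` the first newline is followed by whitespace and the second is preceded by whitespace, so neither matches.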

With one more condition:

(?<=\S)(?<!\bAbstract)(?:\n|\\n)(?=\S)

Try this for your other case.

https://regex101.com/r/crj3aD/2
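
Replacing with a space turns `quan-\ntum` into `quan- tum`, while the desired output in the question shows `quan-tum`. One possible way to handle that, sketched here as an assumption and not taken from the original answer, is to pass a function to `re.sub` that returns an empty string when the line ends in a hyphen. The helper name `join_wrapped_lines` and the sample strings are hypothetical.

```python
import re

# Extended pattern from the answer: matches a real newline or a literal backslash-n.
PATTERN = re.compile(r'(?<=\S)(?<!\bAbstract)(?:\n|\\n)(?=\S)')


def join_wrapped_lines(text):
    """Hypothetical helper: join soft-wrapped lines, keeping hyphenated words intact."""

    def repl(match):
        # (?<=\S) guarantees there is a character before the match.
        # After a trailing hyphen, join directly; otherwise insert a space.
        return "" if text[match.start() - 1] == "-" else " "

    return PATTERN.sub(repl, text)


sample = "update quan-\ntum population\nof QEA\n\nAbstract\nWith advancement"
result = join_wrapped_lines(sample)
print(result)

# The same pattern also covers text where \n appears as two literal characters.
literal = join_wrapped_lines(r"integrated\nwith")
print(literal)
```

Note that this keeps the hyphen itself, as in the question's desired output; removing it to produce `quantum` would need a separate decision about which hyphens are genuine.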

Votes: 2

Stack Overflow user

Posted 2020-06-17 06:30:11

lines = tokenize.sent_tokenize(txt_str)

# sent_tokenize returns a list of sentences, so replace within each one
s = [line.replace('\n', '') for line in lines]

print(s)

Votes: 0

Stack Overflow user

Posted 2020-06-17 06:34:38

This should do it for you:

import re
pattern = re.compile(r"^(.*\(BQCEA\))(.*Beg)(Abstract)(With.*)", re.DOTALL)

try:
    with open('sample.txt', 'r') as f:
        line = f.read()
        # remove some unwanted characters
        r = line.replace('\\n', "").replace("'", "").replace("\n", "")
        print(r)
        for match in re.finditer(pattern, r):
            print(match.group(1))
            print('\n')
            print(match.group(2))
            print('\n')
            print(match.group(3))
            print(match.group(4))
except Exception as er:
    print(er)

Output:

Balanced Quantum Classical Evolutionary Algorithm(BQCEA)


Muhammad Shahid, Hasan Mujtaba,Muhammad Asim, Omer Beg


Abstract
With advancement in Quantum computing, classicalalgorithms are adapted and integratedwith Quantum properties such as qubit representationand entanglement,  Although theseproperties perform better however pre-matureconvergence is the main issue in QuantumEvolutionary Algorithms(QEA) because QEA uses onlythe best individual to update quan-tum population,  In this paper, we introduced a newway to update the quantum populationof QEA to avoid premature convergence

Sample:

'Balanced Quantum Classical Evolutionary Algorithm(BQCEA)\n\nMuhammad Shahid, Hasan Mujtaba,
Muhammad Asim, Omer Beg\n\nAbstract\nWith advancement in Quantum computing, classical
algorithms are adapted and integrated\nwith Quantum properties such as qubit representation
and entanglement', ' Although these\nproperties perform better however pre-mature
convergence is the main issue in Quantum\nEvolutionary Algorithms(QEA) because QEA uses only
the best individual to update quan-\ntum population', ' In this paper, we introduced a new
way to update the quantum population\nof QEA to avoid premature convergence'

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/62422383
