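I am using pdfminer to convert a PDF to text. The problem is that pdfminer adds a \n at the end of every line in the PDF, even where the sentence has not ended. As you can see in the text below, every line is treated as a sentence, which is not right. I have also included another version of the text to show where the newline characters are. For example,
quan-
tum population.
should be in one sentence. So I replaced \n with '' and that problem is solved. But the other \n characters get replaced too, which I don't want.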
Balanced Quantum Classical Evolutionary Algorithm(BQCEA)
Muhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer Beg
Abstract
With advancement in Quantum computing, classical algorithms are adapted and integrated
with Quantum properties such as qubit representation and entanglement. Although these
properties perform better however pre-mature convergence is the main issue in Quantum
Evolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-
tum population. In this paper, we introduced a new way to update the quantum population
of QEA to avoid premature convergence
'Balanced Quantum Classical Evolutionary Algorithm(BQCEA)\n\nMuhammad Shahid, Hasan Mujtaba,
Muhammad Asim, Omer Beg\n\nAbstract\nWith advancement in Quantum computing, classical
algorithms are adapted and integrated\nwith Quantum properties such as qubit representation
and entanglement', ' Although these\nproperties perform better however pre-mature
convergence is the main issue in Quantum\nEvolutionary Algorithms(QEA) because QEA uses only
the best individual to update quan-\ntum population', ' In this paper, we introduced a new
way to update the quantum population\nof QEA to avoid premature convergence',
I tried this code:
from nltk import tokenize  # assumption: tokenize is nltk.tokenize

lines = tokenize.sent_tokenize(txt_str)
for l in lines:
    s = l.replace('\n', '')
    print(s)
This produces the following:
Balanced Quantum Classical Evolutionary Algorithm(BQCEA)Muhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer BegAbstractWith advancement in Quantum computing, classical algorithms are adapted and integratedwith Quantum properties such as qubit representation and entanglement.
Although theseproperties perform better however pre-mature convergence is the main issue in QuantumEvolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-tum population.
In this paper, we introduced a new way to update the quantum populationof QEA to avoid premature convergence.
But this is not what I want. I want this version of the text:
Balanced Quantum Classical Evolutionary Algorithm(BQCEA)
Muhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer Beg
Abstract
With advancement in Quantum computing, classical algorithms are adapted and integrated with Quantum properties such as qubit representation and entanglement. Although these properties perform better however pre-mature convergence is the main issue in Quantum Evolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-tum population. In this paper, we introduced a new way to update the quantum population of QEA to avoid premature convergence
I don't want the blank lines to disappear. I hope you understand.
Posted on 2020-06-17 06:13:24
(?<=\S)(?<!\bAbstract)\n(?=\S)
You can try this. See the demo:
https://regex101.com/r/crj3aD/1
Python script:
import re
inp = "Balanced Quantum Classical Evolutionary Algorithm(BQCEA)\n\nMuhammad Shahid, Hasan Mujtaba, Muhammad Asim, Omer Beg\n\nAbstract\nWith advancement in Quantum computing, classical algorithms are adapted and integrated\nwith Quantum properties such as qubit representation and entanglement', ' Although these\nproperties perform better however pre-mature convergence is the main issue in Quantum\nEvolutionary Algorithms(QEA) because QEA uses only the best individual to update quan-\ntum population', ' In this paper, we introduced a new way to update the quantum population\nof QEA to avoid premature convergence"
output = re.sub(r'(?<=\S)(?<!\bAbstract)\n(?=\S)', ' ', inp)
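print(output)
If the text can also contain a literal backslash followed by n instead of a real newline, this pattern matches both:
(?<=\S)(?<!\bAbstract)(?:\n|\\n)(?=\S)
Try it for your other case.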
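To make the behaviour of the first pattern easier to check, here is a minimal sketch that wraps it in a function and runs it on a shortened stand-in for the sample text. The function name `join_wrapped_lines`, the `HYPHEN_WRAP` step, and the shortened sample are assumptions for illustration and are not part of the answer. The hyphen step is added only because the pattern alone would turn "quan-\ntum" into "quan- tum", while the question asks for "quan-tum".

```python
import re

# Extra step (not in the answer): a hyphen followed by a line wrap.
HYPHEN_WRAP = re.compile(r'(?<=\S)-\n(?=\S)')

# Pattern from the answer: a newline with non-whitespace on both sides,
# unless the text right before it is the word "Abstract".
SOFT_WRAP = re.compile(r'(?<=\S)(?<!\bAbstract)\n(?=\S)')


def join_wrapped_lines(text):
    """Join soft-wrapped lines; blank lines and the heading break are kept."""
    # Drop the newline after a wrapped hyphen but keep the hyphen itself.
    text = HYPHEN_WRAP.sub('-', text)
    # Replace every remaining soft wrap with a single space.
    return SOFT_WRAP.sub(' ', text)


# Shortened, made-up sample with the same layout as the text in the question.
sample = (
    "Title(BQCEA)\n\n"
    "Author One, Author Two\n\n"
    "Abstract\n"
    "classical algorithms are adapted and integrated\n"
    "with Quantum properties. QEA uses only the best individual to update quan-\n"
    "tum population."
)

result = join_wrapped_lines(sample)
print(result)
```

The blank lines survive because a newline next to another newline fails one of the two `\S` lookarounds, and the break after "Abstract" survives because of the negative lookbehind. Other headings would need their own exclusions.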
Posted on 2020-06-17 06:30:11
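from nltk import tokenize  # assumption: tokenize is nltk.tokenize

lines = tokenize.sent_tokenize(txt_str)
# sent_tokenize returns a list, so apply replace() to each sentence
for l in lines:
    print(l.replace('\n', ' '))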
Posted on 2020-06-17 06:34:38
This should do it for you:
import re

# Capture the title, the author line, the "Abstract" heading and the body
pattern = re.compile(r"^(.*\(BQCEA\))(.*Beg)(Abstract)(With.*)", re.DOTALL)
try:
    with open('sample.txt', 'r') as f:
        line = f.read()
    # remove some unwanted characters
    r = line.replace('\\n', "").replace("'", "").replace("\n", "")
    print(r)
    for match in re.finditer(pattern, r):
        print(match.group(1))
        print('\n')
        print(match.group(2))
        print('\n')
        print(match.group(3))
        print(match.group(4))
except Exception as er:
    print(er)
Output:
Balanced Quantum Classical Evolutionary Algorithm(BQCEA)
Muhammad Shahid, Hasan Mujtaba,Muhammad Asim, Omer Beg
Abstract
With advancement in Quantum computing, classicalalgorithms are adapted and integratedwith Quantum properties such as qubit representationand entanglement, Although theseproperties perform better however pre-matureconvergence is the main issue in QuantumEvolutionary Algorithms(QEA) because QEA uses onlythe best individual to update quan-tum population, In this paper, we introduced a newway to update the quantum populationof QEA to avoid premature convergence
Sample input:
'Balanced Quantum Classical Evolutionary Algorithm(BQCEA)\n\nMuhammad Shahid, Hasan Mujtaba,
Muhammad Asim, Omer Beg\n\nAbstract\nWith advancement in Quantum computing, classical
algorithms are adapted and integrated\nwith Quantum properties such as qubit representation
and entanglement', ' Although these\nproperties perform better however pre-mature
convergence is the main issue in Quantum\nEvolutionary Algorithms(QEA) because QEA uses only
the best individual to update quan-\ntum population', ' In this paper, we introduced a new
way to update the quantum population\nof QEA to avoid premature convergence'
https://stackoverflow.com/questions/62422383