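I wrote some Python code that copies an existing text file (.txt) to a new file with a different name in the same location. It copies all the text from the original file as expected: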

    a=open("file1.txt", "r") #existing file
    b=open("file2.txt", "w") #file did not previously exist, hence "w"
    for reform1 in a.readlines():
        b.write(reform1) #write the lines from 'reform1'
    reform1=a.readlines() #read the lines in the file
    a.close() #close file a (file1)
    b.close() #close file b (file2)

I have now been asked to modify the new file so that duplicate lines and blank lines are removed from the copy (while leaving the original untouched), keeping the rest of the text (the unique lines). How can I do this?

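Posted on 2016-11-30 16:56:31
This writes every line of 'file1.txt' to 'file2.txt' except lines that consist only of whitespace or that are duplicates. The order is preserved, on the assumption that only the first instance of a duplicated line should be written: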
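
    seen = set()  # lines already written to the output
    with open('file1.txt') as f, open('file2.txt','w') as o:
        for line in f:
            # skip whitespace-only lines and lines seen before
            if not line.isspace() and line not in seen:
                o.write(line)
                seen.add(line)

Note that str.isspace() is True for any string made up entirely of whitespace (e.g. tabs), not just a newline. Use if not line == '\n' for a stricter definition of a blank line (assuming there are no '\r' newline characters).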
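To make the difference between the two blank-line checks concrete, here is a small sketch. The sample strings are made up for illustration and are not taken from the question's files.

```python
# Hypothetical sample lines, invented for this illustration.
samples = ["\n", "   \n", "\t\n", "text\n"]

# str.isspace() is True when every character in the string is whitespace.
loose = [s.isspace() for s in samples]

# Comparing with '\n' only matches a line that is exactly one newline.
strict = [s == "\n" for s in samples]

print(loose)   # [True, True, True, False]
print(strict)  # [True, False, False, False]
```

With the strict check, lines containing only spaces or tabs would be kept in the output, which may or may not be what you want.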
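I use with statements to handle opening and closing the files, and I read the file line by line, which is the most Pythonic way.
For copying files in Python, you should use shutil, as explained here.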
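As a rough sketch of the shutil suggestion, the plain copy step from the question could look like the following. The helper name copy_text_file is hypothetical; shutil.copyfile is the standard-library call that does the work.

```python
import shutil


def copy_text_file(src, dst):
    """Copy the contents of src to dst, overwriting dst if it already exists.

    copy_text_file is a hypothetical helper name used only for this sketch.
    """
    shutil.copyfile(src, dst)
    return dst


# Example call, assuming file1.txt exists in the current directory:
# copy_text_file("file1.txt", "file2.txt")
```

This only replaces the straight copy. Filtering out blank and duplicate lines still requires reading the file line by line.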

Posted on 2016-11-30 16:47:43
Try this:

    import re
    a=open("file1.txt", "r") #existing file
    b=open("file2.txt", "w") #file did not previously exist, hence "w"
    exists = set() #lines already written to file2
    for reform1 in a.readlines():
        if reform1 in exists:
            continue #skip duplicate lines
        elif re.match(r'^\s*$', reform1):
            continue #skip empty or whitespace-only lines
        else:
            b.write(reform1) #write the lines from 'reform1'
            exists.add(reform1)
    a.close() #close file a (file1)
    b.close() #close file b (file2)

Posted on 2016-11-30 17:00:41
Try:

    a=open("file1.txt", "r") #existing file
    b=open("file2.txt", "w") #file did not previously exist, hence "w"
    seen = [] #lines already written to file2
    for reform1 in a.readlines():
        if reform1 not in seen and len(reform1) > 1:
            b.write(reform1) #write the lines from 'reform1'
            seen.append(reform1)
    a.close() #close file a (file1)
    b.close() #close file b (file2)

I use "len(reform1) > 1" because when I created my test file, the blank lines contained one character, presumably a "\r" or "\n" character. Adjust this as needed for your application.
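
If the length check turns out to be too fragile (for example, when a blank line contains spaces, tabs, or a "\r\n" ending), one possible adjustment is to test the stripped line instead. The sketch below is only one way to do it; the function name write_unique_nonblank is hypothetical, and it uses a set rather than a list for the lines already seen.

```python
def write_unique_nonblank(src, dst):
    """Copy src to dst, skipping blank lines and repeated lines.

    write_unique_nonblank is a hypothetical name used only for this sketch.
    """
    seen = set()  # set lookups are O(1) on average; list lookups are O(n)
    with open(src) as infile, open(dst, "w") as outfile:
        for line in infile:
            # strip() removes "\r", "\n", spaces and tabs, so a blank line
            # becomes the empty string regardless of its line ending.
            if line.strip() and line not in seen:
                outfile.write(line)
                seen.add(line)


# Example call, assuming file1.txt exists in the current directory:
# write_unique_nonblank("file1.txt", "file2.txt")
```

Lines are compared including their line ending, so a final line without a trailing newline is treated as different from the same text followed by one.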
https://stackoverflow.com/questions/40893689