Q: Correct Python code to remove duplicate lines from an output text file

Stack Overflow user

Asked 2021-06-18 17:11:47


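I read data from a text file, convert each line to a tuple, process the elements in a for loop, and write an output text file.

The output data is correct, except that some lines appear two, three, or more times. My input text file looks like this: Data for input is taken from this text doc here

This is the intended output txt file

this is my output text

My code is as follows: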

myList=[]
with open("data_2.txt") as f: 
    for line in f:
        myList.append(tuple(line.rstrip().split()))
dic = {}
for index, ele in enumerate(myList):
    key=index+1
    val_2=float(ele[1])
    val_3=float(ele[2])
    dic.update({key: (ele[0],val_2,val_3)})
    for i in range(0,len(dic)):
            power= 5//(val_2)
            P=pow(0.5,power)
            cal_grams=val_3*P
            if cal_grams<100:
                outfile = open("Element_Shortage_List.txt", "a")           
                outfile.write(str(ele[0])+ "   " + str(cal_grams) + "\n")
                with open("Element_Shortage_List.txt", 'r') as linremove:
                    words = set(linremove.read().split())
                outfile.close()

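The code above produces lines that are duplicated, triplicated, and in some cases repeated even more times. The part of the code that is supposed to remove the duplicate lines is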

with open("Element_Shortage_List.txt", 'r') as linremove:
                words = set(linremove.read().split())

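I tried changing outfile = open("Element_Shortage_List.txt", "a") to outfile = open("Element_Shortage_List.txt", "w"), but instead of duplicates I then get only a single line (the last line of the intended output list).

Does anyone know the correct code to remove the repeated lines?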

text-files

python

duplicates

output

1 Answer

Stack Overflow user

Answered 2021-06-18 17:32:07

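There is a neat trick (which you are already using, in a way) for removing duplicates from a list: convert it to a set, then convert it back to a list, like this: arr = list( set( arr ) ). With that in mind, I can think of two ways to do this.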
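As a small illustration of that trick (the sample list here is made up, standing in for lines of an output file): the set round-trip drops duplicates but does not keep the original order. If order matters, `dict.fromkeys` is one alternative that keeps the first occurrence of each item on Python 3.7 and later.

```python
# Hypothetical sample data, standing in for lines of the output file.
arr = ["H   12.5\n", "He   3.0\n", "H   12.5\n", "Li   40.0\n", "He   3.0\n"]

# The set round-trip removes duplicates, but the resulting order is arbitrary.
unique_any_order = list(set(arr))

# dict.fromkeys keeps the first occurrence of each item, in original order.
unique_in_order = list(dict.fromkeys(arr))

print(sorted(unique_any_order))
print(unique_in_order)
```

Either form can be dropped into the two approaches described in this answer; the choice only affects the order of the lines that are written.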

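First, write everything you want to the output file. Then read the output file back, add each line to a set, and rewrite the output file from that set. Pseudocode: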

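## When you are done writing your output file

# Read all lines back, then reopen in 'w' mode so the file is truncated before rewriting.
with open('output file address', 'r') as my_output_file:
    reader = my_output_file.readlines()
reader = list(set(reader))  # removes duplicates; line order is not preserved
with open('output file address', 'w') as my_output_file:
    my_output_file.writelines(reader)  # write() takes a str, writelines() takes a list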

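Second, you can cache the output in a list (instead of writing it one line at a time), convert the cached output to a set, and then write it to the output file. Pseudocode: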

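output_cache = []
with open('address') as file:
    for line in file:
        ## when you are done processing the line, instead of writing it
        ## to the output file, cache it
        my_processed_line = line  # placeholder: replace with your own processing
        output_cache.append(my_processed_line)

output_cache = list(set(output_cache))  # removes duplicates; order is not preserved
with open('address of output', 'w') as file:
    file.writelines(output_cache)  # writelines() accepts a list of strings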
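The caching idea can be applied to the calculation from the question roughly as follows. This is only a sketch under some assumptions: the function names `shortage_lines` and `write_shortage_list` and the parameters `span` and `threshold` are made up for illustration (they stand for the constants 5 and 100 in the question), and each input row is assumed to have at least three whitespace-separated fields. Note that in the question's code the inner `for i in range(0,len(dic))` loop repeats the same write once per entry already in `dic`, which by itself would explain the repeated lines; here each row is processed once and the file is written once at the end.

```python
def shortage_lines(rows, span=5, threshold=100):
    """Return one output line per row whose computed value is below threshold.

    rows: iterable of tuples of strings, as produced by line.split().
    Duplicated lines are skipped while first-seen order is kept.
    """
    seen = set()
    lines = []
    for row in rows:
        name, val_2, val_3 = row[0], float(row[1]), float(row[2])
        power = span // val_2                 # same floor division as the question
        cal_grams = val_3 * pow(0.5, power)
        line = f"{name}   {cal_grams}\n"
        if cal_grams < threshold and line not in seen:
            seen.add(line)
            lines.append(line)
    return lines


def write_shortage_list(in_path, out_path):
    """Read the input file, compute the lines, and write the output once."""
    with open(in_path) as f:
        rows = [tuple(line.split()) for line in f if line.strip()]
    # Opening with "w" a single time, outside any loop, avoids both the
    # repeated appends and the "only the last line survives" effect.
    with open(out_path, "w") as out:
        out.writelines(shortage_lines(rows))
```

With the file names from the question, a call would look like `write_shortage_list("data_2.txt", "Element_Shortage_List.txt")`.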


Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/68032440
