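I have a set of text files that I read with np.genfromtxt. They are usually in a standard format: one text file per measured plate, with 300 wells per plate. That gives me the following headers: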
headers =['ID','Diameter','Radius','Xpos','Ypos']
#the data looks like
[1,105,53.002,784.023,91.76],
[2,104,51.552,787.023,91.71],
...
[300,104,51.552,787.023,91.71]
Now I have a set of text files where, instead of each well being measured once per plate, each well is measured twice:
[1,105,53.002,784.023,91.76],
[1,104,53.012,784.024,91.76],
[2,104,51.552,787.023,91.71],
[2,106,51.532,786.823,91.69],
...
[300,104,51.552,787.023,91.71],
[300,104,51.557,785.993,91.6]
or where every second well is measured twice:
[1,105,53.002,784.023,91.76],
[1,104,53.012,784.024,91.76],
[3,104,51.552,787.023,91.71],
[3,106,51.532,786.823,91.69],
...
[300,104,51.552,787.023,91.71],
[300,104,51.557,785.993,91.6]
or where every third well is measured twice:
[1,105,53.002,784.023,91.76],
[1,104,53.012,784.024,91.76],
[4,104,51.552,787.023,91.71],
[4,106,51.532,786.823,91.69],
...
[300,104,51.552,787.023,91.71],
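[300,104,51.557,785.993,91.6]
What I want is a way to look at the first value of each row, the 'ID', and based on that take the average of however many rows share the same ID, then continue with the rest of my code to analyze the results.
This is how I usually read the 1/1 data:
dataA=np.genfromtxt(fname,dtype=float, delimiter='\t', names=True)
If every well in the text file has a duplicate row, i.e. a second measurement, this line works well:
lines = open( 'filename.txt', "r" ).readlines()[::2]
Any ideas on how to get an array as output with no duplicated IDs? Ideally it would hold the average of the rows that share an ID, but just the unique rows might be enough.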
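Posted on 2019-07-23 22:36:51
You can use the code below. It does not take the average, but it removes the rows with a duplicated ID, keeping the first row for each ID (the first column here).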
import numpy as np
a = np.array([[2,8,3,1], [3,2,3,3], [5,3,2,1], [1,4,2,3], [3,6,3,4], [2,4,5,6], [4,1,1,1]])
a[np.unique(a[:,0],return_index=True,axis=0)[1]]
https://stackoverflow.com/questions/57165939
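If the average of the rows sharing an ID is needed rather than just the first row, one possible approach is to group the rows with np.unique(..., return_inverse=True) and accumulate the per-ID sums with np.add.at. The following is only a minimal sketch: the function name average_by_id is made up for illustration, and it assumes the data is a plain 2-D float array with the ID in column 0. Note that np.genfromtxt with names=True returns a structured array, so that would need converting first (or the file could be loaded with skip_header=1 instead).

```python
import numpy as np

def average_by_id(data):
    """Average all rows that share the same value in column 0 (the ID).

    Hypothetical helper; assumes a plain 2-D numeric array, not the
    structured array that genfromtxt returns when names=True.
    """
    data = np.asarray(data, dtype=float)
    # ids: sorted unique IDs
    # inverse: for every row, the index of its ID within ids
    # counts: how many rows each ID has
    ids, inverse, counts = np.unique(
        data[:, 0], return_inverse=True, return_counts=True
    )
    # Sum the rows of each ID; np.add.at handles repeated indices correctly
    sums = np.zeros((ids.size, data.shape[1]))
    np.add.at(sums, inverse, data)
    # Divide each per-ID sum by its row count to get the mean
    return sums / counts[:, None]

# Sample rows in the question's layout: ID, Diameter, Radius, Xpos, Ypos
rows = [
    [1, 105, 53.002, 784.023, 91.76],
    [1, 104, 53.012, 784.024, 91.76],
    [4, 104, 51.552, 787.023, 91.71],
    [4, 106, 51.532, 786.823, 91.69],
]
averaged = average_by_id(rows)
print(averaged)
```

Because the grouping is driven by the ID values themselves, the same call should cover all three layouts from the question (every well, every second well, or every third well measured twice), and IDs that appear only once pass through unchanged. The output is sorted by ID.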