Hello, I have a data file like this:
Species COL1 COL2 COL3 COL4 COL5
SP1 0 0 0 1-2 0-1-2
SP2 1-2 2 0 1 0
SP3 0-1 1 2 0 1-2
I want to add new columns that count, for each row, how many times specific unique values occur, for example:
Species COL1 COL2 COL3 COL4 COL5 count_0 count_1-2 count_0-1-2 count_1 count_2
SP1 0 0 0 1-2 0-1-2 3 1 1 0 0
SP2 1-2 2 0 1 0 2 1 0 1 1
SP3 0-1 1 2 0 1-2 1 1 0 2 1
Does anyone know how to do this?
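Posted on 2022-12-04 11:54:09
You can use the value_counts method from the pandas library to count how many times each unique value occurs in each row of your data.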
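As a minimal sketch of what value_counts returns for a single row, the snippet below builds a hypothetical Series that mirrors the SP1 row from the question. It assumes every cell is stored as a string, which is why the lookups use '0' and '1' rather than integers.

```python
import pandas as pd

# Hypothetical single row mirroring SP1 from the question; all cells are strings
row = pd.Series(['0', '0', '0', '1-2', '0-1-2'],
                index=['COL1', 'COL2', 'COL3', 'COL4', 'COL5'])

# value_counts returns a Series indexed by the unique values found in the row
counts = row.value_counts()

# Look up a count by its string key; fall back to 0 when the value is absent
count_0 = counts.get('0', 0)
count_1 = counts.get('1', 0)
print(count_0, count_1)
```

Using get with a default avoids a KeyError for values that never appear in the row.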
# Assumes df already holds the table from the question, with every cell read
# as a string (for example via dtype=str), so the keys below are '0', '1', '2'
# rather than the integers 0, 1, 2.
# Loop through each row of the dataframe
for index, row in df.iterrows():
    # Count the occurrences of each unique value in the row, skipping the Species label
    counts = row.drop('Species').value_counts()
    # Add the count values to the current row of the dataframe
    df.loc[index, 'count_0'] = counts['0'] if '0' in counts else 0
    df.loc[index, 'count_1-2'] = counts['1-2'] if '1-2' in counts else 0
    df.loc[index, 'count_0-1-2'] = counts['0-1-2'] if '0-1-2' in counts else 0
    df.loc[index, 'count_1'] = counts['1'] if '1' in counts else 0
    df.loc[index, 'count_2'] = counts['2'] if '2' in counts else 0

Posted on 2022-12-04 12:15:58
Example
import pandas as pd

data = {'Species': {0: 'SP1', 1: 'SP2', 2: 'SP3'},
        'COL1': {0: '0', 1: '1-2', 2: '0-1'},
        'COL2': {0: '0', 1: '2', 2: '1'},
        'COL3': {0: '0', 1: '0', 2: '2'},
        'COL4': {0: '1-2', 1: '1', 2: '0'},
        'COL5': {0: '0-1-2', 1: '0', 2: '1-2'}}
df = pd.DataFrame(data)

Code
df1 = (df.set_index('Species').apply(lambda x: x.value_counts(), axis=1)
       .add_prefix('count_').fillna(0).astype('int'))

df1
count_0 count_0-1 count_0-1-2 count_1 count_1-2 count_2
Species
SP1 3 0 1 0 1 0
SP2 2 0 0 1 1 1
SP3 1 1 0 1 1 1

Make the desired output
Concatenate df and df1:
pd.concat([df.set_index('Species'), df1], axis=1)

https://stackoverflow.com/questions/74675276
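The value_counts approach above produces one column per value found in the data, in sorted order, including count_0-1, which the question did not ask for. The following is a sketch of one way to keep only the requested values, in the question's order, before concatenating. The names wanted, counts, and result are hypothetical, and all cells are assumed to be strings.

```python
import pandas as pd

data = {'Species': {0: 'SP1', 1: 'SP2', 2: 'SP3'},
        'COL1': {0: '0', 1: '1-2', 2: '0-1'},
        'COL2': {0: '0', 1: '2', 2: '1'},
        'COL3': {0: '0', 1: '0', 2: '2'},
        'COL4': {0: '1-2', 1: '1', 2: '0'},
        'COL5': {0: '0-1-2', 1: '0', 2: '1-2'}}
df = pd.DataFrame(data)

# Values to count, in the order the question lists them (hypothetical name)
wanted = ['0', '1-2', '0-1-2', '1', '2']

counts = (df.set_index('Species')
            .apply(lambda row: row.value_counts(), axis=1)
            # Keep only the requested values; unknown ones become all-zero columns
            .reindex(columns=wanted, fill_value=0)
            # Values absent from a given row are NaN after apply; treat them as 0
            .fillna(0)
            .astype(int)
            .add_prefix('count_'))

result = pd.concat([df.set_index('Species'), counts], axis=1)
print(result)
```

For SP3 this gives count_1 = 1, since the value 1 appears once among its five cells, whereas the expected table in the question lists 2.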