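I want to compare a group of values using a moving window. Let me try to explain it better: I have a column in a pandas DataFrame, and I want to test whether 5 consecutive rows hold the same value. I want to run this check in a moving window, that is, compare rows 0 to 4, then rows 1 to 5, and so on, so that I can make certain changes. I'd like to know how to do this in a better way than mine, since I am using the iterrows method.
My approach: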
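for idx, row in df[2:-2].iterrows():
    # Assumes the default integer index, so idx-1 / idx+1 are valid labels.
    previous2 = df.loc[idx-2, 'speed_limit']
    previous1 = df.loc[idx-1, 'speed_limit']
    now = row['speed_limit']
    next1 = df.loc[idx+1, 'speed_limit']
    next2 = df.loc[idx+2, 'speed_limit']
    # If the four neighbours agree and only the middle value differs, overwrite it.
    if (next1 == next2) and (previous1 == previous2) and (previous1 == next1) and (now != previous1):
        df.at[idx, 'speed_limit'] = previous1

Thank you for your patience. Any suggestions would be greatly appreciated. Have a nice day.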
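
To make the intended behaviour concrete, here is a minimal, self-contained sketch of the same check run on a small made-up column. The function name `smooth_with_loop` and the sample values are assumptions for illustration only, and the sketch assumes the default 0..n-1 integer index.

```python
import pandas as pd


def smooth_with_loop(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Row-by-row version of the check, assuming a default 0..n-1 integer index."""
    df = df.copy()  # work on a copy so the caller's frame is left untouched
    for idx in range(2, len(df) - 2):
        previous2 = df.at[idx - 2, column]
        previous1 = df.at[idx - 1, column]
        now = df.at[idx, column]
        next1 = df.at[idx + 1, column]
        next2 = df.at[idx + 2, column]
        # All four neighbours carry the same value...
        neighbours_agree = previous1 == previous2 == next1 == next2
        # ...and only the middle value differs, so overwrite it.
        if neighbours_agree and now != previous1:
            df.at[idx, column] = previous1
    return df


sample = pd.DataFrame({"speed_limit": [50, 50, 30, 50, 50, 70, 70, 70]})
result = smooth_with_loop(sample, "speed_limit")
print(result["speed_limit"].tolist())  # [50, 50, 50, 50, 50, 70, 70, 70]
```

Only the isolated 30 is replaced. The step from 50 to 70 is kept, because there the neighbours on the two sides disagree.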
Posted on 2022-10-06 12:56:22
I'd like to post my own solution, based on numpy's select and pandas' shift, which is faster than the previous one.
import numpy as np

def noise_remove(df, speed_limit_column):
    speed_data_column = df[speed_limit_column]
    # shift(+n) brings in the value from n rows earlier, shift(-n) from n rows later.
    previous_1 = speed_data_column.shift(+1)
    previous_2 = speed_data_column.shift(+2)
    next_1 = speed_data_column.shift(-1)
    next_2 = speed_data_column.shift(-2)
    # True where the two values before and the two values after are all equal.
    neighbours_agree = ((previous_1 == previous_2) &
                        (next_1 == next_2) &
                        (previous_1 == next_1))
    conditions = [neighbours_agree & (speed_data_column == previous_1),
                  neighbours_agree & (speed_data_column != previous_1)]
    choices = [speed_data_column, previous_1]
    df[speed_limit_column] = np.select(conditions, choices, default=speed_data_column)
    return df

If you have any suggestions, I'd appreciate them. Have a nice day!
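
For readers who want to try the shift-based idea on its own, the following is a minimal sketch under a few assumptions: the column is a plain numeric dtype, and the function name `remove_isolated_values` and the sample values are made up for illustration. It swaps in `np.where` for `np.select`, since there is only one real condition, and casts back to the input dtype because `shift` introduces NaN and therefore upcasts integer columns to float.

```python
import numpy as np
import pandas as pd


def remove_isolated_values(values: pd.Series) -> pd.Series:
    """Replace a value that differs from four identical neighbours (two per side)."""
    prev_1, prev_2 = values.shift(1), values.shift(2)    # rows idx-1, idx-2
    next_1, next_2 = values.shift(-1), values.shift(-2)  # rows idx+1, idx+2

    # NaN introduced by shift() never compares equal, so the first and last
    # two rows can never satisfy the condition and are left untouched.
    is_outlier = (
        (prev_1 == prev_2)
        & (next_1 == next_2)
        & (prev_1 == next_1)
        & (values != prev_1)
    )

    cleaned = np.where(is_outlier, prev_1, values)
    # Cast back to the input dtype, since shift() upcast integers to float.
    return pd.Series(cleaned, index=values.index, name=values.name).astype(values.dtype)


df = pd.DataFrame({"speed_limit": [50, 50, 30, 50, 50, 70, 70, 70]})
df["speed_limit"] = remove_isolated_values(df["speed_limit"])
print(df["speed_limit"].tolist())  # [50, 50, 50, 50, 50, 70, 70, 70]
```

Because every comparison is evaluated on whole columns at once, no Python-level loop over rows is needed, which is where the speed-up over iterrows comes from.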
https://stackoverflow.com/questions/73957399