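I want to count the number of blocks of consecutive 1s in a data frame, based on the LabelId column. For example, given the following data frame:
Input DF: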
eventTime velocity LabelId
1 2017-08-19 12:53:55.050 3 0
2 2017-08-19 12:53:55.100 4 1
3 2017-08-19 12:53:55.150 180 1
4 2017-08-19 12:53:55.200 2 1
5 2017-08-19 12:53:55.250 5 0
6 2017-08-19 12:53:55.050 3 0
7 2017-08-19 12:53:55.100 4 1
8 2017-08-19 12:53:55.150 70 1
9 2017-08-19 12:53:55.200 2 1
10 2017-08-19 12:53:55.250 5 0
Output = 2, because there are two blocks of 1s: Block_1 = rows 2-4 and Block_2 = rows 7-9.
Any help would be greatly appreciated.
Best regards, Carlo
Posted on 2018-03-05 19:53:08
We can use diff(), as follows:
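d = df.LabelId.diff()
d.iloc[0] = df.LabelId.iloc[0]
The second line is needed because diff() leaves NaN in the first row, which has no previous row to compare against. This gives you:
[0, 1, 0, 0, -1, 0, 1, 0, 0, -1]
The number of groups of 1s is the number of times the diff equals 1. Therefore:
(d == 1).sum()
gives you the answer.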
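For reference, the diff() approach can be run end to end on the LabelId column from the question. This is only a minimal sketch: the helper name count_blocks_diff is hypothetical, and it assumes the labels are 0/1 integers and the series is non-empty.

```python
import pandas as pd


def count_blocks_diff(labels: pd.Series) -> int:
    """Count runs of consecutive 1s in a non-empty 0/1 series (hypothetical helper)."""
    # Difference with the previous row; the first row becomes NaN
    d = labels.diff()
    # Seed the first row with its own label so a block starting at row 1 is counted
    d.iloc[0] = labels.iloc[0]
    # Each 0 -> 1 transition (diff == 1) marks the start of a block
    return int((d == 1).sum())


# LabelId values copied from the question's example, rows 1-10
df = pd.DataFrame({"LabelId": [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]})
num_blocks = count_blocks_diff(df["LabelId"])
print(num_blocks)  # 2
```

Seeding the first row matters when the data starts inside a block: for labels [1, 1, 0, 1] the diff alone would find only one transition, while the seeded version counts both blocks.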
Posted on 2018-03-05 19:59:24
Here is another simple approach:
INTERESTING_LABEL = 1
df = ... # Make data frame
# Find positions where the label is not present
s = (df.LabelId != INTERESTING_LABEL)
# Counter that increases where the label is not present
# Then select where the label is present and count unique values
num_blocks = s.cumsum()[~s].nunique()
https://stackoverflow.com/questions/49109770
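The cumsum-based answer can also be tried on the question's data. The sketch below wraps it in a function; the name count_label_blocks is an assumption made for illustration, and the data frame is built only from the LabelId column shown in the question.

```python
import pandas as pd


def count_label_blocks(labels: pd.Series, label) -> int:
    """Count runs of consecutive rows equal to `label` (hypothetical helper)."""
    # True wherever the label of interest is absent
    not_label = labels != label
    # The running count of absent rows stays constant inside a run of `label`,
    # so each run maps to one distinct counter value
    return int(not_label.cumsum()[~not_label].nunique())


# LabelId values copied from the question's example, rows 1-10
df = pd.DataFrame({"LabelId": [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]})
print(count_label_blocks(df["LabelId"], 1))  # 2
print(count_label_blocks(df["LabelId"], 0))  # 3
```

Unlike the diff-based count, this version does not rely on the labels being 0 and 1, so it should also work when LabelId takes more than two values.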