# A tibble: 10 x 1
a
<dbl>
1 1.
2 2.
3 3.
4 0.
5 5.
6 0.
7 7.
8 0.
9 0.
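10 0.
If you look at column a, you'll notice that the seventh row is the last row with a value greater than 0 (positive). How can I get R to find this seventh row?
In other words, I want to filter so that rows 1-7 are kept and everything after row 7 (rows 8-10) is dropped, because row 7 is the last row with a positive value. Here's a tibble to get us started.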
library(tidyverse)
df <- tibble(a = c(1, 2, 3, 0, 5, 0, 7, 0, 0, 0)) %>% print()
Posted on 2018-11-08 15:16:18
One concise way is
df[1:max(which(df$a>0)),]
# A tibble: 7 x 1
# a
# <dbl>
# 1 1
# 2 2
# 3 3
# 4 0
# 5 5
# 6 0
# 7 7
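The one-liner above relies on at least one value being positive: if none is, `which()` returns an empty vector, `max()` of it is `-Inf` (with a warning), and the `1:...` index cannot be built. A minimal sketch of a guarded version follows. The helper name `keep_through_last_positive` is hypothetical and not part of the original answer, and a plain `data.frame` is assumed so that no extra packages are needed.

```r
# Hypothetical helper (not from the original answer): keep rows 1..k,
# where k is the index of the last strictly positive value in `x`.
keep_through_last_positive <- function(data, x) {
  pos <- which(x > 0)  # indices of positive values; which() skips NA
  if (length(pos) == 0L) {
    # No positive value at all: return zero rows instead of failing.
    return(data[0, , drop = FALSE])
  }
  data[seq_len(max(pos)), , drop = FALSE]
}

df <- data.frame(a = c(1, 2, 3, 0, 5, 0, 7, 0, 0, 0))
res <- keep_through_last_positive(df, df$a)
nrow(res)

# Edge case: a column with no positive values yields an empty result.
df0 <- data.frame(a = c(0, 0, 0))
res0 <- keep_through_last_positive(df0, df0$a)
nrow(res0)
```

Using `seq_len()` rather than `1:n` is a small defensive choice: it never counts backwards when `n` is 0.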
Other options that return the same rows for this df:
df[1:which.max(cumsum(df$a)),]
head(df,1-which.max(rev(df$a)>0))
df[rev(cumsum(rev(df$a>0)))>0,]
Let's make a longer df$a and compare all the approaches:
library(microbenchmark)
df <- data.frame(a = rbinom(5000, 2, 0.2) - 1)
microbenchmark(
df[1:max(which(df$a>0)),],
df[1:which.max(cumsum(df$a)),],
head(df,1-which.max(rev(df$a)>0)),
df[rev(cumsum(rev(df$a>0)))>0,],
df[1:tail(which(sign(df$a) == 1), 1),],
times = 10000
)
# Unit: microseconds
#                                     expr     min       lq      mean   median       uq       max neval cld
#             df[1:max(which(df$a > 0)), ]  52.817  58.5800 102.80519  62.2160  71.5910  17108.65 10000   a
#          df[1:which.max(cumsum(df$a)), ]  36.190  40.7620  65.68274  43.0785  49.7835  18827.08 10000   a
#   head(df, 1 - which.max(rev(df$a) > 0)) 214.812 230.7590 355.37321 249.1085 297.4340  18158.22 10000   c
#     df[rev(cumsum(rev(df$a > 0))) > 0, ] 106.391 114.6345 192.44990 124.4690 141.5650  14473.12 10000   b
#  df[1:tail(which(sign(df$a) == 1), 1), ] 106.152 116.8985 207.69863 125.6520 150.3425 195384.36 10000   b
https://stackoverflow.com/questions/53210344
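Timing aside, it can be worth checking that the five expressions agree before picking the fastest one. The sketch below uses base R only and the question's example values, and computes the number of rows each approach would keep. The vectors `b` and `d` are made-up inputs, added here to illustrate two cases where the variants can differ: negative values for the `cumsum` form, and a positive last element for the `head` form.

```r
df <- data.frame(a = c(1, 2, 3, 0, 5, 0, 7, 0, 0, 0))

# Number of rows each approach keeps on the example data.
n_which  <- max(which(df$a > 0))
n_cumsum <- which.max(cumsum(df$a))
n_head   <- nrow(head(df, 1 - which.max(rev(df$a) > 0)))
n_mask   <- sum(rev(cumsum(rev(df$a > 0))) > 0)
n_tail   <- tail(which(sign(df$a) == 1), 1)
counts <- c(n_which, n_cumsum, n_head, n_mask, n_tail)
counts

# Made-up input with a negative value: the running sum is 5, 4, 5,
# so which.max() picks index 1 while the last positive value is at index 3.
b <- c(5, -1, 1)
last_pos_b   <- max(which(b > 0))
cumsum_cut_b <- which.max(cumsum(b))

# Made-up input whose last value is positive: the head() form
# evaluates to head(d_df, 0), which keeps no rows.
d_df <- data.frame(a = c(0, 1))
n_head_d <- nrow(head(d_df, 1 - which.max(rev(d_df$a) > 0)))
```

Since the benchmark data above is drawn from `rbinom(5000, 2, 0.2) - 1` and therefore contains `-1` values, the `cumsum` row in the timing table may be subsetting a different range than the others.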