I have a tibble like this:
> dat
# A tibble: 556 × 3
sample run abc
<chr> <chr> <list>
1 206_03_07_2013 21102016 <tibble [304 × 21]>
2 206_04_07_2017 7082017 <tibble [229 × 21]>
3 206_04_10_2015 25112015 <tibble [2,687 × 21]>
4 206_07_08_2013 15102015 <tibble [460 × 21]>
5 206_08_12_2016 3032017 <tibble [3,250 × 21]>
6 206_11_03_2014 21102016 <tibble [975 × 21]>
7 206_13_02_2013 21112016 <tibble [101 × 21]>
8 206_13_03_2013 21112016 <tibble [345 × 21]>
9 206_14_08_2014 8092016 <tibble [1,952 × 21]>
10 206_19_03_2015 25012016 <tibble [11 × 21]>
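# ... with 546 more rows
The abc column contains tibbles of different lengths. I want to filter the dat tibble by those lengths, keeping only the rows whose nested tibble has more than 100 rows.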
I can do it like this:
dat[sapply(dat$abc, nrow) > 100, ]
but I would like to do it following the dplyr philosophy. Any ideas?
Thanks
Posted on 2017-08-18 18:46:22
One approach could be:
library(dplyr)
library(purrr)
dat <- tribble(
~foo, ~bar,
1, as_tibble(head(iris, 3)),
2, as_tibble(head(iris, 7))
)
# # A tibble: 2 x 2
# foo bar
# <dbl> <list>
# 1 1 <tibble [3 x 5]>
# 2 2 <tibble [7 x 5]>
res <- filter(dat, map_int(bar, nrow) > 5)
# # A tibble: 1 x 2
# foo bar
# <dbl> <list>
# 1 2 <tibble [7 x 5]>
desired_output <- dat[sapply(dat$bar,nrow)>5,]
identical(res, desired_output)
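# [1] TRUE
Compared with what you tried, there is no real added value here: it is just a matter of swapping [ and sapply for their drop-in replacements, filter and map_int respectively. Base R functions are not incompatible with the so-called "dplyr philosophy". If you mean using the magrittr pipe %>%, then dat %>% .[sapply(.$bar, nrow) > 5, ] and dat %>% filter(map_int(bar, nrow) > 5) both work equally well.
Note: I usually prefer all.equal over identical, but I could not get it to work: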
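Applied to a column named abc and a threshold of 100 rows, as in the question, the same idea might look like the sketch below. The three-row dat built here is a made-up stand-in for the real 556-row tibble, and the sample labels s1 to s3 are placeholders, not values from the question.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the asker's data: a list column `abc`
# holding nested tibbles with 50, 120 and 150 rows.
dat <- tibble(
  sample = c("s1", "s2", "s3"),
  abc = list(
    as_tibble(head(iris, 50)),
    as_tibble(head(iris, 120)),
    as_tibble(iris)
  )
)

# map_int() returns one integer per nested tibble (its row count),
# so the comparison yields a logical vector that filter() can use.
res <- dat %>% filter(map_int(abc, nrow) > 100)

res$sample
```

map_int is used rather than sapply because it always returns an integer vector or fails with an error, which makes the condition passed to filter predictable.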
all.equal(res, desired_output)
# Error in equal_data_frame(target, current, ignore_col_order = ignore_col_order, :
# Can't join on 'bar' x 'bar' because of incompatible types (list / list)
https://stackoverflow.com/questions/45752109