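I want to sample rows from a data frame group by group. The catch is that I want to sample a different number of records from each group, based on data in another table. Here is my reproducible data: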
library(tidyverse)

df <- data_frame(
  Stratum = rep(c("High", "Medium", "Low"), 10),
  id = c(1:30),
  Value = runif(30)
)

sampleGuide <- data_frame(
  Stratum = c("High", "Medium", "Low"),
  Surveys = c(3, 2, 5)
)

The output should look like this:
# A tibble: 10 × 2
   Stratum Value
   <chr>   <dbl>
 1 High    0.21504972
 2 High    0.71069005
 3 High    0.09286843
 4 Medium  0.52553056
 5 Medium  0.06682459
 6 Low     0.38793128
 7 Low     0.01285081
 8 Low     0.87865734
 9 Low     0.09100829
10 Low     0.14851919

Here is my attempt, which does not work:
> df %>%
+   left_join(sampleGuide, by = "Stratum") %>%
+   group_by(Stratum) %>%
+   sample_n(unique(Surveys))
Error in unique(Surveys) : object 'Surveys' not found

I also tried:
> df %>%
+   group_by(Stratum) %>%
+   nest() %>%
+   left_join(sampleGuide, by = "Stratum") %>%
+   mutate(sample = map(., ~ sample_n(data, Surveys)))
Error in mutate_impl(.data, dots) :
  Don't know how to sample from objects of class function

sample_n seems to require size to be a single number. Any ideas?
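I'm only looking for tidyverse solutions. Bonus points for purrr!
This is a similar question, but I'm not satisfied with the accepted answer because, in real life, the number of strata I'm dealing with is large.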
Posted on 2017-01-16 03:56:26
Figured it out using map2() from purrr:
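df %>%
  nest(-Stratum) %>%                                  # one row per stratum, its rows in a list-column named data
  left_join(sampleGuide, by = "Stratum") %>%          # attach the sample size for each stratum
  mutate(Sample = map2(data, Surveys, sample_n)) %>%  # sample Surveys rows from each stratum's data
  unnest(Sample)

https://stackoverflow.com/questions/41666714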
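
For anyone on a newer tidyverse, here is a minimal sketch of the same map2() idea. It assumes dplyr 1.0.0 or later, where slice_sample() supersedes sample_n(), and tidyr 1.0.0 or later, where nest() expects the named form nest(data = -Stratum). The set.seed() call, the name sampled, and the select() step that drops the leftover list-column are my additions for illustration and are not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(purrr)

set.seed(42)  # assumption: fixed seed only so the sketch is repeatable

# Same shape as the data in the question
df <- tibble(
  Stratum = rep(c("High", "Medium", "Low"), 10),
  id = 1:30,
  Value = runif(30)
)

sampleGuide <- tibble(
  Stratum = c("High", "Medium", "Low"),
  Surveys = c(3, 2, 5)
)

sampled <- df %>%
  nest(data = -Stratum) %>%                    # one row per stratum, rows kept in list-column `data`
  left_join(sampleGuide, by = "Stratum") %>%   # attach the per-stratum sample size
  mutate(Sample = map2(data, Surveys, ~ slice_sample(.x, n = .y))) %>%  # sample n rows per stratum
  select(Stratum, Sample) %>%                  # drop the unsampled list-column and the size column
  unnest(Sample)                               # back to one row per sampled record

sampled
```

Because the sample size is looked up from sampleGuide by a join, this should scale to a large number of strata without listing them by hand.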