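I'm trying to do something very simple: use the forcats package in R to work with factors. I have a data frame with some factor variables, one of which is gender, and I just want to count the occurrences of each level with fct_count. The documentation gives the syntax as fct_count(f) (it could hardly be simpler!).
I tried to do this the dplyr way, using the pipe operator rather than the $ syntax to access the variable, but it doesn't seem to work. Am I fundamentally misunderstanding the syntax?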
library(dplyr)
library(forcats)

pid <- c('id1','id2','id3','id4','id5','id6')
gender <- c('Male','Female','Other','Male','Female','Female')
df <- data.frame(pid, gender)
df <- tibble::as_tibble(df)  # as.tibble() is deprecated in favour of as_tibble()
df
# A tibble: 6 x 2
  pid   gender
  <chr> <fct>
1 id1   Male
2 id2   Female
3 id3   Other
4 id4   Male
5 id5   Female
6 id6   Female

# This throws an error
df %>%
  mutate(gender = as.factor(gender)) %>%
  fct_count(gender)
# Error: `f` must be a factor (or character vector).

# This works but doesn't use the nice dplyr select syntax
fct_count(df$gender)
# A tibble: 3 x 2
  f          n
  <fct>  <int>
1 Female     3
2 Male       2
3 Other      1

Where am I going wrong? I'm new to dplyr, so apologies for such a silly question, but I can't seem to find a basic example anywhere!
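
Answered 2020-08-15 01:13:05
fct_count takes a vector of type factor or character; it knows nothing about tibbles or data frames. Because the pipe passes its left-hand side as the first argument, df %>% fct_count(gender) is evaluated as fct_count(df, gender), so the whole tibble ends up in f. The simplest pipeline is therefore to pull the column out first: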
library(dplyr)
library(forcats)

df %>%
  pull(gender) %>%
  fct_count()
#> # A tibble: 3 x 2
#>   f          n
#>   <fct>  <int>
#> 1 Female     3
#> 2 Male       2
#> 3 Other      1

Your data
pid <- c('id1','id2','id3','id4','id5','id6')
gender <- c('Male','Female','Other','Male','Female','Female')
df <- data.frame(pid, gender)
df <- tibble::as_tibble(df)
df

Answered 2020-08-15 00:34:52
You can just use group_by and n():
pid <- c('id1','id2','id3','id4','id5','id6')
gender <- c('Male','Female','Other','Male','Female','Female')
df <- data.frame(pid, gender)
df <- tibble::tibble(df)
df %>%
  dplyr::group_by(gender) %>%
  dplyr::summarise(cnt_gender = dplyr::n()) %>%
  dplyr::ungroup()

https://stackoverflow.com/questions/63416616
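
Taken together, the two answers point to a few equivalent ways of getting the counts inside a pipe. The following is only a minimal sketch, not part of either answer: it rebuilds the example tibble from the question, and the with() and count() variants, along with the names by_pull, by_with, and by_count, are assumptions added here for illustration.

```r
library(dplyr)
library(forcats)

# Example data from the question, with gender stored as a factor.
df <- tibble::tibble(
  pid = c("id1", "id2", "id3", "id4", "id5", "id6"),
  gender = factor(c("Male", "Female", "Other", "Male", "Female", "Female"))
)

# x %>% f(y) is evaluated as f(x, y), so the piped call in the question
# becomes fct_count(df, gender): the tibble lands in `f`, hence the error.

# Variant 1 (from the first answer): extract the column as a vector first.
by_pull <- df %>% pull(gender) %>% fct_count()

# Variant 2 (assumption): evaluate the call with the columns of df in scope.
by_with <- df %>% with(fct_count(gender))

# Variant 3 (assumption): stay with data-frame verbs; count() is shorthand
# for group_by() followed by summarise(n = n()).
by_count <- df %>% count(gender)
```

The first two variants return a tibble with columns f and n, as fct_count always does, while count() keeps the original column name gender next to n. Which one fits best probably depends on whether the next step in the pipeline expects a vector or a data frame.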