I have the following dataset:
Age Monday Tuesday Wednesday
6-9 a b c
6-9 b a c
6-9 b c a
9-10 c c b
9-10 c a b

I want to find the total frequency of a, b, and c within each age group using R, like this:
Age a b c
6-9 3 3 3
9-10 1 2 3

Posted on 2019-11-13 14:39:16

We can reshape the data to long format, count the values with count(), and then reshape back to wide format.
library(dplyr)
library(tidyr)
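df %>%
  # stack the day columns into name/value pairs, keeping Age as the id column
  pivot_longer(cols = -Age) %>%
  # count how often each value occurs within each Age group
  count(Age, value) %>%
  # spread the counts back out, one column per value
  pivot_wider(names_from = value, values_from = n)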
# Age a b c
# <fct> <int> <int> <int>
#1 6-9 3 3 3
#2 9-10 1 2 3

Data
df <- structure(list(Age = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("6-9",
"9-10"), class = "factor"), Monday = structure(c(1L, 2L, 2L,
3L, 3L), .Label = c("a", "b", "c"), class = "factor"), Tuesday = structure(c(2L,
1L, 3L, 3L, 1L), .Label = c("a", "b", "c"), class = "factor"),
Wednesday = structure(c(3L, 3L, 1L, 2L, 2L), .Label = c("a",
"b", "c"), class = "factor")), class = "data.frame", row.names = c(NA, -5L))

Posted on 2019-11-13 17:45:50
Given the input data df:
df <- structure(list(Age = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("6-9",
"9-10"), class = "factor"), Monday = structure(c(1L, 2L, 2L,
3L, 3L), .Label = c("a", "b", "c"), class = "factor"), Tuesday = structure(c(2L,
1L, 3L, 3L, 1L), .Label = c("a", "b", "c"), class = "factor"),
Wednesday = structure(c(3L, 3L, 1L, 2L, 2L), .Label = c("a",
"b", "c"), class = "factor")), class = "data.frame", row.names = c(NA, -5L))

Then, if you plan to use base R, the following may help:
# split the data frame into a list of data frames, one per Age group
lst <- split(df, df$Age)
# for each group, tabulate all values across the day columns, then stack the one-row results
zlst <- do.call(rbind, sapply(seq_along(lst), function(k) {
  counts <- table(unlist(lst[[k]][, -1]))
  cbind(data.frame(Age = names(lst)[k]), t(as.data.frame.factor(counts)))
}, simplify = FALSE))
# reset the row names to 1, 2, ...
rownames(zlst) <- seq(nrow(zlst))

This finally gives:
> zlst
Age a b c
1 6-9 3 3 3
2 9-10 1 2 3

Posted on 2019-11-14 01:59:54

We can use just table() from base R:
table(rep(df$Age, 3), unlist(df[-1]))
# a b c
# 6-9 3 3 3
# 9-10 1 2 3

Data
df <- structure(list(Age = structure(c(1L, 1L, 1L, 2L, 2L), .Label = c("6-9",
"9-10"), class = "factor"), Monday = structure(c(1L, 2L, 2L,
3L, 3L), .Label = c("a", "b", "c"), class = "factor"), Tuesday = structure(c(2L,
1L, 3L, 3L, 1L), .Label = c("a", "b", "c"), class = "factor"),
Wednesday = structure(c(3L, 3L, 1L, 2L, 2L), .Label = c("a",
"b", "c"), class = "factor")), class = "data.frame", row.names = c(NA, -5L))

https://stackoverflow.com/questions/58831542
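
A note on the table() answer above: it works because unlist(df[-1]) stacks the day columns one after another, so Age has to be repeated once per day column to stay aligned with the stacked values. The following is only a minimal sketch of that same idea without the hard-coded 3, assuming every column other than the grouping column holds the values to count. The helper name count_by_group is made up for illustration and does not come from any of the answers; the sample data is rebuilt as plain character columns rather than factors.

```r
# Hypothetical helper (not from the answers): cross-tabulate a grouping
# column against the values found in all remaining columns.
count_by_group <- function(data, group_col) {
  value_cols <- setdiff(names(data), group_col)
  # unlist() concatenates the value columns one after another, so the
  # grouping vector is repeated once per value column to stay aligned.
  groups <- rep(as.character(data[[group_col]]), times = length(value_cols))
  values <- unlist(lapply(data[value_cols], as.character), use.names = FALSE)
  tab <- table(groups, values)
  # as.data.frame.matrix() keeps the wide layout: one column per value.
  wide <- as.data.frame.matrix(tab)
  out <- cbind(group = rownames(wide), wide)
  names(out)[1] <- group_col
  rownames(out) <- NULL
  out
}

# Same sample data as in the question, written as character columns.
df <- data.frame(
  Age       = c("6-9", "6-9", "6-9", "9-10", "9-10"),
  Monday    = c("a", "b", "b", "c", "c"),
  Tuesday   = c("b", "a", "c", "c", "a"),
  Wednesday = c("c", "c", "a", "b", "b")
)

result <- count_by_group(df, "Age")
print(result)
```

For the sample data this yields counts of 3, 3, 3 for 6-9 and 1, 2, 3 for 9-10, matching the output requested in the question. Because the result is an ordinary data frame with Age as a column, it can be merged or written out directly, which a bare table object does not allow without conversion.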