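I have a data.frame that shows the results of a survey, with one column per question. There are only 5 different satisfaction levels (VDSAT, DSAT, NTL, SAT, VSAT), so every column contains only values from that set.
I want a summary data.frame that gives, for each column, how many times each satisfaction level occurs (with 0 if a level does not occur). Each column is a question, each row is a satisfaction level, and each cell is the count.
I tried apply(df,2,table), which gives me a list of tables, and then converted that to a data.frame with data.frame(matrix(unlist(table),nrow=5)).
This works well as long as every column has at least one response for every satisfaction level. If a column has no "DSAT", the resulting data.frame is wrong, because table called from apply does not report the missing level.
Basically, the output would look like this: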
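A minimal sketch of why this can happen, assuming the columns are factors that share the five levels. The toy data frame `df` below is hypothetical, not the survey data. `apply()` converts the data frame to a character matrix before calling `table()`, so the factor levels are lost and only observed values are tabulated.

```r
# Hypothetical toy data (not the asker's survey): Q2 has no "DSAT" response.
lvls <- c("VDSAT", "DSAT", "NTL", "SAT", "VSAT")
df <- data.frame(
  Q1 = factor(c("VDSAT", "DSAT", "NTL", "SAT", "VSAT"), levels = lvls),
  Q2 = factor(c("VDSAT", "VDSAT", "NTL", "SAT", "VSAT"), levels = lvls)
)

# apply() first converts df with as.matrix(), giving a character matrix,
# so table() only tabulates the values that actually occur in each column.
by_apply <- apply(df, 2, table)
lengths(by_apply)   # the Q2 table is one entry shorter than the Q1 table

# Tabulating the factor column directly keeps the unused level with count 0.
table(df$Q2)
```

Because the per-column tables then differ in length, `matrix(unlist(...), nrow = 5)` fills the cells out of alignment.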
Q1 Q2 Q3 Q4 Q5 Satisfaction
12 16 22 24 23 Very dissatisfied
27 30 33 24 33 Dissatisfied
49 36 33 30 32 Neutral
6 11 17 25 22 Satisfied
22 23 11 13 6 Very satisfied
Thanks a lot,
Kind regards,
Edit: sample of the raw data:
Q1 Q2 Q3 Q4 Q5
Very satisfied Very satisfied Very satisfied Very satisfied Very satisfied
Satisfied Dissatisfied Very dissatisfied Dissatisfied Satisfied
Very satisfied Very satisfied Very satisfied Very satisfied Very satisfied
Satisfied Satisfied Satisfied Satisfied Satisfied
Very satisfied Very satisfied Very satisfied Very satisfied Very satisfied
... ... ... ... ...
Posted on 2015-08-27 12:54:26
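Here is an approach using dplyr and tidyr. The idea is to reshape the data from wide to long format, count every question/answer combination, and then spread the data back to a wide format.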
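library(tidyr)
library(dplyr)

# Wide -> long, count each (Question, Answer) pair, then long -> wide;
# fill = 0L puts 0 where a pair never occurs.
gather(dat, Question, Answer) %>%
  count(Question, Answer) %>%
  spread(Question, n, fill = 0L)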
#Source: local data frame [5 x 6]
#
# Answer Q1 Q2 Q3 Q4 Q5
#1 DSAT 1 0 1 3 0
#2 NTL 0 0 0 1 2
#3 SAT 0 1 0 0 1
#4 VDSAT 1 3 2 0 1
#5 VSAT 2 0 1 0 0
The sample data I used:
set.seed(12)
dat <- as.data.frame(matrix(sample(c("VDSAT","DSAT","NTL","SAT","VSAT"), 20, TRUE), ncol = 5))
dat[] <- lapply(dat, factor, levels = c("VDSAT","DSAT","NTL","SAT","VSAT"))
names(dat) <- paste0("Q", 1:5)
https://stackoverflow.com/questions/32249504
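For comparison, here is a base R sketch of the same summary, assuming every column can be declared as a factor with the same five levels. The toy data and the names `lvls`, `counts` and `summary_df` are chosen here for illustration and are not part of the answer above.

```r
lvls <- c("VDSAT", "DSAT", "NTL", "SAT", "VSAT")

# Hypothetical toy responses: 4 respondents x 3 questions.
dat <- data.frame(
  Q1 = c("VSAT", "SAT", "VSAT", "NTL"),
  Q2 = c("DSAT", "DSAT", "VSAT", "SAT"),
  Q3 = c("VDSAT", "NTL", "NTL", "NTL")
)

# Declare the full set of levels on every column so absent answers are known.
dat[] <- lapply(dat, factor, levels = lvls)

# One table() per column; equal lengths let sapply() return a 5 x 3 matrix
# whose row names are the satisfaction levels.
counts <- sapply(dat, table)

# Move the level names into an ordinary column, as in the requested layout.
summary_df <- data.frame(counts, Satisfaction = rownames(counts), row.names = NULL)
summary_df
```

Unlike the long/wide reshaping, this keeps a row for a level even when no question received that answer, since the rows come from the declared levels rather than from the observed values.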