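I have a dataset like this:

> df <- data.frame(gender = c(rep("male", 3), rep("female", 3)),
                   Age = c(rep("old", 3), rep("young", 3)),
                   VAR = c(rep(1:3), rep(1:3)),
                   FEN1 = c(21, 26, 29, 30, 6, 11),
                   FEN2 = c(14, 55, 12, 33, 9, 21),
                   FEN3 = c(88, 23, 55, 23, 14, 66))

Here FEN1, FEN2 and FEN3 contain the total number of people who belong to that FEN group and have the characteristics given in the VAR, gender and Age columns.
I need to change it into a data frame in which each row corresponds to one person (536 rows in total), with VAR, gender, Age, etc. as columns.
The expected output would be:
I tried to do this by hand with code like this: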
> df2<-as.data.frame(1:536)
> FEN <- c(rep("FEN1",123), rep("FEN2",144), rep("FEN3",269))
> df2$FEN<-FEN
> Gender<-c(rep("male",...)...

But obviously this is not efficient at all.
Posted on 2017-03-24 19:27:12
Here is a method using base R.
# get the vector names that are used to repeat
fenCats <- tail(names(df), 3)
# construct a list of data.frames where the rows have been repeated
# one data.frame for each of the FEN variables
temp <- Map(function(x) df[rep(seq_len(nrow(df)), x), 1:3], df[fenCats])
# combine list of data.frames and add column with FEN categories
dfNew <- cbind(do.call(rbind, temp),
               "fenCats" = rep(fenCats, colSums(df[fenCats])))

We can verify that the row count is correct.
nrow(dfNew) == sum(colSums(df[fenCats])) &
nrow(dfNew) == sum(rowSums(df[fenCats]))
[1] TRUE

As an additional check, we can pull out the last row of each group using subsetting and cumsum.
dfNew[cumsum(unlist(df[,fenCats])),]
          gender   Age VAR fenCats
FEN1.1.20   male   old   1    FEN1
FEN1.2.25   male   old   2    FEN1
FEN1.3.28   male   old   3    FEN1
FEN1.4.29 female young   1    FEN1
FEN1.5.5  female young   2    FEN1
FEN1.6.10 female young   3    FEN1
FEN2.1.13   male   old   1    FEN2
FEN2.2.54   male   old   2    FEN2
FEN2.3.11   male   old   3    FEN2
FEN2.4.32 female young   1    FEN2
FEN2.5.8  female young   2    FEN2
FEN2.6.20 female young   3    FEN2
FEN3.1.87   male   old   1    FEN3
FEN3.2.22   male   old   2    FEN3
FEN3.3.54   male   old   3    FEN3
FEN3.4.22 female young   1    FEN3
FEN3.5.13 female young   2    FEN3
FEN3.6.65 female young   3    FEN3

https://stackoverflow.com/questions/43007322
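
To make the row-repetition step of the answer easier to follow, here is a minimal sketch of the same idiom on a smaller, made-up data frame. The names `toy`, `n1`, `n2`, `countCols`, `pieces` and `expanded` are hypothetical and do not come from the original post. Indexing a data frame with `rep(seq_len(nrow(toy)), counts)` repeats row i exactly `counts[i]` times, and `table()` is one possible way to confirm the per-category totals afterwards.

```r
# Hypothetical toy data: two groups, with head counts stored in n1 and n2
toy <- data.frame(gender = c("male", "female"),
                  n1 = c(2, 1),
                  n2 = c(0, 3))
countCols <- c("n1", "n2")

# For each count column, repeat every row as many times as its count says
# and record which count column the repeated rows came from
pieces <- lapply(countCols, function(col) {
  out <- toy[rep(seq_len(nrow(toy)), toy[[col]]), "gender", drop = FALSE]
  out$category <- rep(col, nrow(out))
  out
})

# Stack the per-column pieces into one data frame, one row per person
expanded <- do.call(rbind, pieces)
rownames(expanded) <- NULL

# The total number of rows should equal the sum of all counts
nrow(expanded) == sum(toy[countCols])

# Rows per category should match the column sums of the count columns
table(expanded$category)
colSums(toy[countCols])
```

A count of zero simply contributes no rows, so groups that are empty for one category need no special handling in this sketch.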