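I have a large data frame that contains 12 monthly columns for each of two value types, Rested and Active. I want to convert the monthly columns into rows, so that all the month columns (Jan, Feb, Mar, ...) end up under a single "Month" column.
My data looks like this: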
ID L1 L2 Year JR FR MR AR MYR JR JLR AGR SR OR NR DR JA FA MA AA MYA JA JLA AGA SA OA NA DA
1234 89 65 2003 11 34 6 7 8 90 65 54 3 22 55 66 76 86 30 76 43 67 13 98 67 0 127 74
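1234 45 76 2004 67 87 98 5 4 3 77 8 99 76 56 4 3 2 65 78 44 53 67 98 79 53 23 65
I am trying to make it look like the following (columns ending in R stand for Rested and columns ending in A stand for Active; JR, FR, MR mean Jan Rested, Feb Rested, Mar Rested, and JA, FA, MA mean Jan Active, Feb Active, Mar Active, and so on).
So here I am trying to convert each month column into rows, creating a new Month column while keeping each month's R and A values side by side: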
ID L1 L2 Year Month R A
1234 89 65 2003 Jan 11 76
1234 89 65 2003 Feb 34 86
1234 89 65 2003 Mar 6 30
1234 89 65 2003 Apr 7 76
1234 89 65 2003 May 8 43
1234 89 65 2003 Jun 90 67
1234 89 65 2003 Jul 65 13
1234 89 65 2003 Aug 54 98
1234 89 65 2003 Sep 3 67
1234 89 65 2003 Oct 22 0
1234 89 65 2003 Nov 55 127
1234 89 65 2003 Dec 66 74
1234 45 76 2004 Jan 67 3
1234 45 76 2004 Feb 87 2
1234 45 76 2004 Mar 98 65
1234 45 76 2004 Apr 5 78
1234 45 76 2004 May 4 44
1234 45 76 2004 Jun 3 53
1234 45 76 2004 Jul 77 67
1234 45 76 2004 Aug 8 98
1234 45 76 2004 Sep 99 79
1234 45 76 2004 Oct 76 53
1234 45 76 2004 Nov 56 23
1234 45 76 2004 Dec 4 65
I have tried many things, such as stack, melt, and unlist:
data_reshape <- reshape(df,direction="long", varying=list(c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA")), v.names="Precipitation", timevar="Month")
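data_stacked <- stack(data, select = c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA"))
But the results are not quite what I expected: they give the Jan values for all years, then the Feb values for all years, then the Mar values for all years, and so on. What I want is the values arranged month by month for each ID (and year) in the whole data set.
How can I achieve this in R?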
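To make the example reproducible, the two sample rows can be rebuilt as follows. This is only a sketch: the name `df` and the matrix-based construction are assumptions for illustration, not how the real data was loaded. Note that the header as posted contains `JR` and `JA` twice each (January and June), so selecting the month columns by position (5:16 and 17:28) is safer than selecting them by name.

```r
# Column names exactly as shown in the question (JR and JA each appear twice)
cols <- c("ID", "L1", "L2", "Year",
          "JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR",
          "JA", "FA", "MA", "AA", "MYA", "JA", "JLA", "AGA", "SA", "OA", "NA", "DA")

# The two sample rows, one after the other
vals <- c(1234, 89, 65, 2003,
          11, 34, 6, 7, 8, 90, 65, 54, 3, 22, 55, 66,
          76, 86, 30, 76, 43, 67, 13, 98, 67, 0, 127, 74,
          1234, 45, 76, 2004,
          67, 87, 98, 5, 4, 3, 77, 8, 99, 76, 56, 4,
          3, 2, 65, 78, 44, 53, 67, 98, 79, 53, 23, 65)

# Fill the matrix row by row, then attach the names afterwards so the
# duplicated names are kept as they are
df <- as.data.frame(matrix(vals, nrow = 2, byrow = TRUE))
names(df) <- cols

# Names that occur more than once in the header
unique(names(df)[duplicated(names(df))])
```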
Answered 2015-06-08 09:15:19
Here is an approach using base R's reshape:
res <- reshape(mydf, direction="long", varying=list(5:16, 17:28), v.names=c("R", "A"), times = month.name, timevar = "Month")
res[with(res, order(ID, -L1, L2, Year)), -8]
Answered 2015-06-08 07:41:33
Here is one possible solution using the development version of data.table:
library(data.table) ## v >= 1.9.5
res <- melt(setDT(df),
            id.vars = 1:4,                           ## id variables
            measure.vars = list(5:16, 17:ncol(df)),  ## a list of two groups of measure variables
            variable.name = "Month",                 ## the name of the additional variable
            value.name = c("R", "A"))                ## the names of the grouped value columns
setorder(res, ID, -L1, L2, Year) ## Reordering the data to match the desired output
res[, Month := month.abb[Month]] ## Optional: you already have the month numbers at this point
# ID L1 L2 Year Month R A
# 1: 1234 89 65 2003 Jan 11 76
# 2: 1234 89 65 2003 Feb 34 86
# 3: 1234 89 65 2003 Mar 6 30
# 4: 1234 89 65 2003 Apr 7 76
# 5: 1234 89 65 2003 May 8 43
# 6: 1234 89 65 2003 Jun 90 67
# 7: 1234 89 65 2003 Jul 65 13
# 8: 1234 89 65 2003 Aug 54 98
# 9: 1234 89 65 2003 Sep 3 67
# 10: 1234 89 65 2003 Oct 22 0
# 11: 1234 89 65 2003 Nov 55 127
# 12: 1234 89 65 2003 Dec 66 74
# 13: 1234 45 76 2004 Jan 67 3
# 14: 1234 45 76 2004 Feb 87 2
# 15: 1234 45 76 2004 Mar 98 65
# 16: 1234 45 76 2004 Apr 5 78
# 17: 1234 45 76 2004 May 4 44
# 18: 1234 45 76 2004 Jun 3 53
# 19: 1234 45 76 2004 Jul 77 67
# 20: 1234 45 76 2004 Aug 8 98
# 21: 1234 45 76 2004 Sep 99 79
# 22: 1234 45 76 2004 Oct 76 53
# 23: 1234 45 76 2004 Nov 56 23
# 24: 1234 45 76 2004 Dec 4 65
Installation instructions:
library(devtools)
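install_github("Rdatatable/data.table", build_vignettes = FALSE)
Answered 2015-06-08 08:50:32
This is an inelegant solution, but I will post it just to illustrate how to solve the problem with basic tools, without relying on high-level functions, when the task does not strictly need them. I think the more tools you have at hand, the better you can tackle a problem. Here we go: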
# extract the data part and transpose it: 24 rows (months), one column per original row
data <- t(as.matrix(df[, 5:28]))
# build the data.frame by cbinding the needed columns
res <- cbind(df[rep(1:nrow(df), each = 12), 1:4],  # repeats each row of the first 4 columns 12 times
             Month = month.abb,                    # the month column (recycled for every original row)
             R = as.vector(data[1:12, ]),          # the R column, obtained from the first 12 rows of data
             A = as.vector(data[13:24, ]))         # the A column, obtained from the last 12 rows of data
rownames(res) <- NULL  # just to remove the row names
https://stackoverflow.com/questions/30703400
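The transpose-based answer works because `as.vector()` flattens a matrix column by column: after transposing, each column holds one original row, so the months of the first row come out first, then the months of the second row. A minimal sketch on made-up toy data (two rows, two "months"; the names `toy`, `m1R`, `m2R`, `m1A`, `m2A` and all values are hypothetical and not part of the question's data):

```r
# Hypothetical toy data: 2 rows, 2 months, R columns first, then A columns
toy <- data.frame(ID  = c(1, 2),
                  m1R = c(10, 30), m2R = c(20, 40),
                  m1A = c(11, 31), m2A = c(21, 41))
n_months <- 2

# Transpose the value columns: 4 rows (m1R, m2R, m1A, m2A), one column per original row
m <- t(as.matrix(toy[, 2:5]))

long <- data.frame(ID    = rep(toy$ID, each = n_months),               # each ID repeated once per month
                   Month = rep(seq_len(n_months), times = nrow(toy)),  # 1, 2, 1, 2
                   R     = as.vector(m[1:2, ]),                        # column-major: row 1's months, then row 2's
                   A     = as.vector(m[3:4, ]))
long
```

Each ID's rows stay together and in month order, which is the layout the question asks for, without a separate sorting step.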