I have a time series in the classic long format (the output of dput(data,30) is at the end of the question):
Region Country Channel Customer Family Product Pack Date Quantity
A a 1 Z A a 1 2011-11-01 1000
...
A a 1 Z A a 1 2014-11-01 2000
A a 1 Z A a 2 2011-11-01 1000
...
A a 1 Z A a 2 2014-11-01 1000
A a 1 Z A b 1 2011-11-01 1000
...
A a 1 Z A b 2 2014-11-01 1000
...
...
D g 4 P D q 4 2011-11-01 1000
...
D g 4 P D q 4 2014-11-01 1000

I am struggling to find an efficient way to get this data into the structure that gts/hts requires, i.e.
Date Aa1ZAa1 Aa1ZAa2 Aa1ZAa3 ... Aa1ZAb1 Aa1ZBa1 ... ... Dg4PDq4
2011-11-01 1000 1000 234 654 354345 1234
...
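2014-11-01 2000 1000 345 3454 345443 334

I am currently iterating, which is obviously quite slow.
I also have the problem that not all series are the same length, because some new products have been launched and some existing products have started selling in new countries/channels.
Many thanks for any help.
Regards, Trevor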
structure(list(Customer = c("bci", "bci", "bci", "bci", "bci",
"bci", "bci", "bci", "bci", "bci", "bci", "bci", "bci", "bci",
"bci", "bci", "bci", "bci", "bci", "bci", "bci", "bci", "bci",
"bci", "bci", "bci", "bci", "bci", "bci", "bci"), Site = c("SOP030",
"SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030",
"SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030",
"SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030",
"SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030", "SOP030",
"SOP030"), Item = c("System", "System", "System", "System", "System",
"System", "System", "System", "System", "System", "System", "System",
"System", "System", "System", "System", "System", "System", "System",
"System", "System", "System", "System", "System", "System", "System",
"System", "System", "System", "System"), Part = c("Gamer", "Gamer",
"Gamer", "Gamer", "Gamer", "Gamer", "Gamer", "Gamer", "Gamer",
"Gamer", "Gamer", "Gamer", "Gamer", "Gamer", "Gamer", "Server",
"Server", "Server", "Server", "Server", "Server", "Server", "Server",
"Server", "Server", "Server", "Server", "Server", "Server", "Server"
), Date = structure(c(15765, 15796, 15826, 15859, 15887, 15918,
15950, 15979, 16010, 16041, 16071, 16104, 16132, 16161, 16191,
15765, 15796, 15826, 15859, 15887, 15918, 15950, 15979, 16010,
16041, 16071, 16104, 16132, 16161, 16191), class = "Date"), Qty = c(735,
0, 0, 665, 0, 693, 735, 770, 784, 805, 777, 763, 728, 749, 714,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("Customer",
"Site", "Item", "Part", "Date", "Qty"), row.names = c(65L, 66L,
67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 77L, 78L, 79L,
94L, 95L, 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 105L,
106L, 107L, 108L), class = "data.frame")

Answered 2014-11-30 19:43:23
If df is the dataset, you could try
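library(reshape2)
# Build one key per series by pasting the four ID columns together
# (sep='' in interaction() gives names without separators, like Aa1ZAa1).
df1 <- data.frame(Multcol=as.character(interaction(df[,1:4], sep='')),
df[,5:6], stringsAsFactors=FALSE)
# Cast to wide format: one row per Date, one column per series.
res <- dcast(df1, Date~Multcol, value.var='Qty')
head(res,3)
# Date bciSOP030SystemGamer bciSOP030SystemServer
#1 2013-03-01 735 0
#2 2013-04-01 0 0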
#3 2013-05-01 0 0

https://stackoverflow.com/questions/27203495
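On the point about series of unequal length: a long-to-wide reshape leaves a missing value wherever a series has no observation for a date, so shorter series do not need special handling during the reshape itself. The following is a minimal base R sketch of that behaviour, not part of the original answer. The data frame `long`, its column `Key`, and all values are made up for illustration. Filling the gaps with 0 is a modelling assumption (no sales before launch), and the conversion to a monthly `ts` assumes one observation per calendar month with no gaps between the first and last date.

```r
# Toy long-format data (hypothetical): series "Ab2" starts one month later.
long <- data.frame(
  Key  = c("Aa1", "Aa1", "Aa1", "Ab2", "Ab2"),
  Date = as.Date(c("2013-03-01", "2013-04-01", "2013-05-01",
                   "2013-04-01", "2013-05-01")),
  Qty  = c(735, 0, 665, 10, 20),
  stringsAsFactors = FALSE
)

# One row per date, one column per series.
# (Date, Key) pairs that do not occur in the data come out as NA.
wide <- tapply(long$Qty, list(as.character(long$Date), long$Key), sum)

# Assumption: a series that has not started yet had zero sales.
wide[is.na(wide)] <- 0

# Assumption: regular monthly data, so the rows map onto consecutive months.
first <- as.POSIXlt(min(long$Date))
y <- ts(wide, start = c(first$year + 1900, first$mon + 1), frequency = 12)
```

The resulting `y` is a multivariate monthly `ts` whose column names are the series keys, which is the kind of bottom-level matrix the hts package's constructors take as input. Whether zeros or `NA`s are the right choice for the periods before launch depends on the forecasting method used afterwards.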