I have a data frame that looks like this:
master_bill_no category
SBA5100008 CONDOMS
SBA5100008 HAND CREAM
SBA5100009 PREGNANCY TESTS
SBA5100010 MULTI VITAMINS & MIN
SBA5100010 CALCIUM PREPARATIONS
SBA5100010 VITAMINS
SBA5100010 BETABLOCKERS

Here is a reproducible example:
df <- structure(list(master_bill_no = c("SBA5100008", "SBA5100008",
"SBA5100009", "SBA5100010", "SBA5100010", "SBA5100010", "SBA5100010"
), category = c("CONDOMS", "HAND CREAM", "PREGNANCY TESTS", "MULTI VITAMINS & MIN",
"CALCIUM PREPARATIONS", "VITAMINS", "BETABLOCKERS")), .Names = c("master_bill_no",
"category"), class = "data.frame", row.names = c(NA, -7L))

For each unique master_bill_no, I am trying to reshape the category column to wide format, so that all categories for a bill end up in a single row.
For example, the desired output is:
master_bill_no category
SBA5100008 CONDOMS,HAND CREAM
SBA5100009 PREGNANCY TESTS
SBA5100010 MULTI VITAMINS & MIN,CALCIUM PREPARATIONS,VITAMINS,BETABLOCKERS

I used the basic reshape function, but it just dropped the category column:
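reshape(df, idvar = "master_bill_no", timevar = "category", direction = "wide")

I also tried the aggregate function:
aggregate(df, master_bill_no, FUN = paste(category, sep = ","))

This returns the error message "object 'category' not found".
I believe this happens because reshape is looking for a value column to fill the wide cells with, and there is none. Can anyone help?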
Posted on 2016-04-10 08:39:11
IMHO it is best to use base functions such as aggregate. The correct syntax should be:
aggregate(df$category, by=list(df$master_bill_no), FUN = paste)
(the field to aggregate, a list of the 'group by' variables, the function to apply to the field)
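One caveat, offered as a minimal sketch rather than the answerer's own method: with FUN = paste and no collapse argument, each group's values stay a character vector, so when group sizes differ the x column is a list column that merely prints comma-separated. If a single string per bill is wanted, as in the question's desired output, one option is to pass collapse through to paste. The variable name res and the renaming of the result columns are illustrative assumptions, not part of the original answer.

```r
# Rebuild the example data from the question.
df <- data.frame(
  master_bill_no = c("SBA5100008", "SBA5100008", "SBA5100009",
                     "SBA5100010", "SBA5100010", "SBA5100010", "SBA5100010"),
  category = c("CONDOMS", "HAND CREAM", "PREGNANCY TESTS",
               "MULTI VITAMINS & MIN", "CALCIUM PREPARATIONS",
               "VITAMINS", "BETABLOCKERS"),
  stringsAsFactors = FALSE
)

# Arguments given after FUN are passed on to paste(), so collapse = ","
# joins each group's categories into one string per bill.
res <- aggregate(df$category, by = list(df$master_bill_no),
                 FUN = paste, collapse = ",")

# aggregate() names the columns Group.1 and x by default;
# rename them (an illustrative choice) to match the desired output.
names(res) <- c("master_bill_no", "category")
print(res)
```

Here res$category should be an ordinary character vector with one element per bill, which is usually easier to write to a file or join against than a list column.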
> df
master_bill_no category
1 SBA5100008 CONDOMS
2 SBA5100008 HAND CREAM
3 SBA5100009 PREGNANCY TESTS
4 SBA5100010 MULTI VITAMINS & MIN
5 SBA5100010 CALCIUM PREPARATIONS
6 SBA5100010 VITAMINS
7 SBA5100010 BETABLOCKERS
> aggregate(df$category, by=list(df$master_bill_no), FUN = paste)
Group.1 x
1 SBA5100008 CONDOMS, HAND CREAM
2 SBA5100009 PREGNANCY TESTS
3 SBA5100010 MULTI VITAMINS & MIN, CALCIUM PREPARATIONS, VITAMINS, BETABLOCKERS

https://stackoverflow.com/questions/36527518