I have a data.frame with two columns of strings, as shown below.
nos <- c("JM1", "JM2", "JM3", "JM1", "JM5", "JM45", "JM3", "JM45")
ren <- c("book, vend, spent", "marigold, fortune", "smoke, parchment, smell, book", "mental, past, create", "key, fortune, mask, federal", "tell, warn, slip", "wire, dg333, uv12", "tell, warn, slip, furniture")
d <- data.frame(nos, ren, stringsAsFactors=FALSE)
d
   nos                           ren
1  JM1             book, vend, spent
2  JM2             marigold, fortune
3  JM3 smoke, parchment, smell, book
4  JM1          mental, past, create
5  JM5   key, fortune, mask, federal
6 JM45              tell, warn, slip
7  JM3             wire, dg333, uv12
8 JM45   tell, warn, slip, furniture
I want to concatenate the strings in the ren column for rows that share the same value in the nos column.
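For example, in the sample data, the elements associated with JM1, which occurs twice, should be merged ("book, vend, spent, mental, past, create").
Likewise, the elements associated with JM45 should be merged, keeping only the unique words ("tell, warn, slip, furniture").
The output I would like to get is shown below.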
nos1 <- c("JM1", "JM2", "JM3", "JM5", "JM45")
ren1 <- c("book, vend, spent, mental, past, create", "marigold, fortune", "smoke, parchment, smell, book, wire, dg333, uv12", "key, fortune, mask, federal", "tell, warn, slip, furniture")
out <- data.frame(nos1, ren1, stringsAsFactors=FALSE)
out
  nos1                                             ren1
1  JM1          book, vend, spent, mental, past, create
2  JM2                                marigold, fortune
3  JM3 smoke, parchment, smell, book, wire, dg333, uv12
4  JM5                      key, fortune, mask, federal
5 JM45                      tell, warn, slip, furniture
How can I do this in R? My original dataset has thousands of rows like these in the data.frame.
Answered 2014-05-02 12:15:14
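Using the plyr package, you can do this:
library(plyr)
# Concatenate all ren strings within each nos group
ddply(d, .(nos), summarise, ren1=paste0(ren, collapse=", "))
Or, if you want only the unique values in ren1:
# Split each string into words, drop duplicates, then re-join
ddply(d, .(nos), summarise,
      ren1=paste0(unique(unlist(strsplit(ren, split=", "))), collapse=", "))
https://stackoverflow.com/questions/23427831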
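If plyr is not available, the same split, de-duplicate, and re-join idea can probably be done in base R. The following is only a sketch, not part of the original answer: `merge_words` is a hypothetical helper name introduced here, and the factor conversion is an assumption made so that groups come out in order of first appearance, as in the desired output.

```r
# Rebuild the example data from the question
nos <- c("JM1", "JM2", "JM3", "JM1", "JM5", "JM45", "JM3", "JM45")
ren <- c("book, vend, spent", "marigold, fortune",
         "smoke, parchment, smell, book", "mental, past, create",
         "key, fortune, mask, federal", "tell, warn, slip",
         "wire, dg333, uv12", "tell, warn, slip, furniture")
d <- data.frame(nos, ren, stringsAsFactors = FALSE)

# Hypothetical helper: split every string on ", ", keep each word once
# (in order of first occurrence), and paste the words back together
merge_words <- function(x) {
  paste(unique(unlist(strsplit(x, split = ", "))), collapse = ", ")
}

# Order factor levels by first appearance so aggregate() does not
# sort the groups alphabetically
d$nos <- factor(d$nos, levels = unique(d$nos))

# Apply the helper to the ren values of each nos group
out <- aggregate(ren ~ nos, data = d, FUN = merge_words)
out$nos <- as.character(out$nos)
out
```

Because `merge_words` sees all strings of a group at once, a word repeated across rows of the same group (such as "tell" for JM45) is kept only once.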