I have a toy example to explain what I am trying to do:
aski = data.frame(x=c("a","b","c","a","d","d"),y=c("b","a","d","a","b","c"))
I managed to assign unique ids to col y, and the output now looks like this:
aski2 = data.frame(x=c("a","b","c","a","d","d"),y=c("1","2","3","2","1","4"))
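As you can see, "b" occurs in both col x and col y. In col y we assigned id=1 to "b", id=2 to "a", and so on. These values also occur in col x. The first element of col x is "a"; "a" is also in col y, where it was assigned id=2, so I want to assign id=2 to "a" in col x as well. What I want to do now is look up each value of col x in col y and, if it occurs there, assign it the same id.
Something like: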
aski3 = data.frame(x=c("2","1","4","2","3","3"),y=c("1","2","3","2","1","4"))
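
Answered 2017-10-07 13:55:21
First, convert both columns to character vectors. Then collect all unique values of the two columns to use as the levels of a factor.
Convert both columns to factors with those levels, and then to numeric.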
aski = data.frame(x=c("a","b","c","a","d","d"),y=c("b","a","d","a","b","c"))
aski$x <- as.character(aski$x)
aski$y <- as.character(aski$y)
lev <- unique(c(aski$y, aski$x))
aski$x <- factor(aski$x, levels=lev)
aski$y <- factor(aski$y, levels=lev)
aski$x <- as.numeric(aski$x)
aski$y <- as.numeric(aski$y)
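aski

Answered 2017-10-07 13:54:14
There is no need to create aski2 as an intermediate step. A possible solution is to use match and lapply to get the numeric representation of the letters: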
# create a vector of the unique values in the order
# in which you want them assigned to '1' till '4'
v <- unique(aski$y)
# convert both columns to integer values with 'match' and 'lapply'
aski[] <- lapply(aski, match, v)
which gives:
> aski
  x y
1 2 1
2 1 2
3 4 3
4 2 2
5 3 1
6 3 4
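To see why this works, here is a minimal sketch of what match does on its own. match(x, table) returns, for each element of x, the position of its first occurrence in table, so the positions in the lookup vector serve as the ids. The names x_vals, y_vals and lookup are illustrative, and the value "z" is a hypothetical addition that is not part of the question; it is there only to show what happens to a value that never occurs in y.

```r
# Same values as the question's two columns, as plain character vectors
# (x_vals, y_vals and lookup are illustrative names)
x_vals <- c("a", "b", "c", "a", "d", "d")
y_vals <- c("b", "a", "d", "a", "b", "c")

# Unique values of y in order of first appearance; a value's position is its id
lookup <- unique(y_vals)        # "b" "a" "d" "c"

# For each element, find its position in the lookup vector
ids_y <- match(y_vals, lookup)  # 1 2 3 2 1 4
ids_x <- match(x_vals, lookup)  # 2 1 4 2 3 3

# Hypothetical value "z" (not in the question): no match, so the result is NA
id_missing <- match("z", lookup)
```

If x may contain values that never occur in y, building the lookup vector from both columns, for example unique(c(y_vals, x_vals)), avoids the NA.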
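If you want the numbers as characters, you can additionally do:
aski[] <- lapply(aski, as.character)

Answered 2017-10-07 14:50:24
A solution using dplyr. First, we create a vector vec with unique(aski$y) that shows the relationship between index and letter. After this step, you can use Jaap's lapply solution, or use mutate_all from dplyr as shown below.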
# Create the vector showing the relationship of index and letter
vec <- unique(aski$y)
# View vec
vec
[1] "b" "a" "d" "c"
library(dplyr)
# Modify all columns
# (funs() is deprecated as of dplyr 0.8.0; there, use mutate_all(~ match(., vec)))
aski2 <- aski %>% mutate_all(funs(match(., vec)))
# View the results
aski2
  x y
1 2 1
2 1 2
3 4 3
4 2 2
5 3 1
6 3 4

Data
aski <- data.frame(x = c("a","b","c","a","d","d"),
                   y = c("b","a","d","a","b","c"),
                   stringsAsFactors = FALSE)

https://stackoverflow.com/questions/46620755