Can anyone explain why these two correlation matrices return different results?
library(recommenderlab)
data(MovieLense)
cor_mat <- as( similarity(MovieLense, method = "pearson", which = "items"), "matrix" )
cor_mat_base <- suppressWarnings( cor(as(MovieLense, "matrix"), use = "pairwise.complete.obs") )
print( cor_mat[1:5, 1:5] )
print( cor_mat_base[1:5, 1:5] )
Posted on 2019-06-19 14:28:10
dissimilarity() = 1 - pmax(cor(), 0), where cor() is the base R function. It is also important to specify method for both of them so that they use the same one:
library("recommenderlab")
data(MovieLense)
cor_mat <- as( dissimilarity(MovieLense, method = "pearson", which = "items"), "matrix" )
cor_mat_base <- suppressWarnings( cor(as(MovieLense, "matrix"), method = "pearson", use = "pairwise.complete.obs") )
print( cor_mat[1:5, 1:5] )
print(1- cor_mat_base[1:5, 1:5] )
> print( cor_mat[1:5, 1:5] )
                  Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995) Copycat (1995)
Toy Story (1995)         0.0000000        0.7782159         0.8242057         0.8968647      0.6135248
GoldenEye (1995)         0.7782159        0.0000000         0.7694644         0.7554443      0.7824406
Four Rooms (1995)        0.8242057        0.7694644         0.0000000         1.0000000      0.8153877
Get Shorty (1995)        0.8968647        0.7554443         1.0000000         0.0000000      1.0000000
Copycat (1995)           0.6135248        0.7824406         0.8153877         1.0000000      0.0000000
> print(1- cor_mat_base[1:5, 1:5] )
                  Toy Story (1995) GoldenEye (1995) Four Rooms (1995) Get Shorty (1995) Copycat (1995)
Toy Story (1995)         0.0000000        0.7782159         0.8242057         0.8968647      0.6135248
GoldenEye (1995)         0.7782159        0.0000000         0.7694644         0.7554443      0.7824406
Four Rooms (1995)        0.8242057        0.7694644         0.0000000         1.2019687      0.8153877
Get Shorty (1995)        0.8968647        0.7554443         1.2019687         0.0000000      1.2373503
Copycat (1995)           0.6135248        0.7824406         0.8153877         1.2373503      0.0000000
To get a better understanding of it, check the documentation of both packages :).
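EDIT: It is important to point out that dissimilarity() and 1 - cor() still differ in a few entries: the ones where 1 - cor() is greater than 1 (1.2019687 and 1.2373503 above). Those entries correspond to negative correlations (about -0.20 and -0.24). dissimilarity() floors the correlation at 0 (i.e. it never uses a negative value), so its result cannot exceed 1, whereas cor() returns negative correlations unchanged. As for whether cor() itself can return values outside [-1, 1], the documentation only states:
For r <- cor(*, use = "all.obs"), it is now guaranteed that all(abs(r) <= 1).
No such guarantee is stated for use = "pairwise.complete.obs", so this should be checked.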
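The effect of flooring the correlation at 0 can be reproduced with base R alone. The following is only a minimal sketch: the small `ratings` matrix is made-up illustration data (not MovieLense), and `1 - pmax(r, 0)` is the formula stated in this answer, not code taken from recommenderlab.

```r
# Toy ratings matrix (rows = users, columns = items); the values are made up.
ratings <- matrix(
  c(5, 1, 5,
    4, 2, NA,
    2, 4, 3,
    1, 5, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C"))
)

# Pairwise Pearson correlation between items, ignoring missing ratings.
r <- cor(ratings, method = "pearson", use = "pairwise.complete.obs")

d_raw     <- 1 - r            # exceeds 1 wherever r is negative
d_clamped <- 1 - pmax(r, 0)   # negative correlations are floored at 0, so d <= 1

print(round(d_raw, 3))
print(round(d_clamped, 3))

# The two versions differ only where the correlation is negative.
differs <- abs(d_raw - d_clamped) > 1e-12
print(which(differs & upper.tri(r), arr.ind = TRUE))
```

To compare against the recommenderlab output directly, the same idea would be `1 - pmax(cor_mat_base, 0)` in place of `1 - cor_mat_base`.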
https://stackoverflow.com/questions/56669777