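Hello Stack Overflow,
I would like to understand a query that uses the Pearson correlation.
What are nom and denom?
What are r1: r1 and r2: r2?
I don't understand what r.r1.rating and r.r2.rating are.
The query is supposed to recommend movies that other users have rated.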
MATCH (u1:User {id: 3})-[r:RATED]->(m:Movie)
WITH u1, avg(r.rating) AS u1_mean
MATCH (u1)-[r1:RATED]->(m:Movie)<-[r2:RATED]-(u2)
WITH u1, u1_mean, u2, COLLECT({r1: r1, r2: r2}) AS ratings WHERE size(ratings) > 10
MATCH (u2)-[r:RATED]->(m:Movie)
WITH u1, u1_mean, u2, avg(r.rating) AS u2_mean, ratings
UNWIND ratings AS r
WITH sum( (r.r1.rating-u1_mean) * (r.r2.rating-u2_mean) ) AS nom,
sqrt( sum( (r.r1.rating - u1_mean)^2) * sum( (r.r2.rating - u2_mean) ^2)) AS denom,
u1, u2 WHERE denom <> 0
WITH u1, u2, nom/denom AS pearson
ORDER BY pearson DESC LIMIT 10
MATCH (u2)-[r:RATED]->(m:Movie) WHERE NOT EXISTS( (u1)-[:RATED]->(m) )
RETURN m.name, SUM( pearson * r.rating) AS score
ORDER BY score DESC LIMIT 25
The output is as follows:
│"m.name"                   │"score"           │
│"Sleepless in Seattle"     │25.859451877376813│
│"The Tunnel"               │22.652532472101605│
│"Beetlejuice"              │22.21835919736008 │
│"Shriek If You Know What.."│21.935357890253528│
│"Dawn of the Dead"         │21.421377433824798│
│"The Prisoner of Zenda"    │21.225502683325033│
│"The Talented Mr. Ripley"  │20.83938743140176 │
Any suggestions would be very helpful.
Posted on 2021-01-27 00:34:09
So, the Pearson formula is the one shown here: https://en.wikipedia.org/wiki/Pearson_correlation_coefficient#For_a_sample
nom is simply the numerator of that formula, defined here as: "WITH sum( (r.r1.rating-u1_mean) * (r.r2.rating-u2_mean) ) AS nom,"
Likewise, denom is the denominator.
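To make the mapping explicit, here is the sample formula from that Wikipedia section written out. The symbols x_i, y_i and n are my own labels, not names from the query: x_i stands for r.r1.rating, y_i for r.r2.rating, and n for the number of movies both users have rated.

```latex
r_{u_1 u_2}
  = \frac{\overbrace{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}^{\texttt{nom}}}
         {\underbrace{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}_{\texttt{denom}}}
\qquad \bar{x} = \texttt{u1\_mean}, \quad \bar{y} = \texttt{u2\_mean}
```

One thing to note: in the query, u1_mean and u2_mean are averaged over all movies each user has rated, while the sums run only over the movies both users rated, so the result may differ slightly from a textbook sample correlation computed on the shared movies alone.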
I'm not so sure about the other two questions, but I hope this helps!
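On the remaining two questions, one way to read the query: COLLECT({r1: r1, r2: r2}) builds a list of maps, where the key r1 holds the RATED relationship from u1 to a shared movie and the key r2 holds the RATED relationship from u2 to that same movie. After UNWIND ratings AS r, each r is one of those maps, so r.r1.rating is u1's rating of the movie and r.r2.rating is u2's rating of it. Below is a minimal Python sketch of the same computation, not the query itself. The sample ratings and the two means are made up for illustration, the function name pearson is hypothetical, and the size(ratings) > 10 filter from the query is left out to keep the example short.

```python
from math import sqrt


def pearson(ratings, u1_mean, u2_mean):
    """Mirror the nom/denom step of the query for one pair of users.

    `ratings` plays the role of the collected list: one dict per shared
    movie, with "r1" holding u1's RATED data and "r2" holding u2's.
    """
    # Numerator: sum of products of each user's deviation from their mean.
    nom = sum(
        (r["r1"]["rating"] - u1_mean) * (r["r2"]["rating"] - u2_mean)
        for r in ratings
    )
    # Denominator: square root of the product of the two sums of squares.
    denom = sqrt(
        sum((r["r1"]["rating"] - u1_mean) ** 2 for r in ratings)
        * sum((r["r2"]["rating"] - u2_mean) ** 2 for r in ratings)
    )
    # Same guard as `WHERE denom <> 0` in the query.
    if denom == 0:
        return None
    return nom / denom


# Made-up data: three movies that both users rated.
ratings = [
    {"r1": {"rating": 5.0}, "r2": {"rating": 4.0}},
    {"r1": {"rating": 3.0}, "r2": {"rating": 2.0}},
    {"r1": {"rating": 1.0}, "r2": {"rating": 1.0}},
]
u1_mean = 3.0  # assumed average over all of u1's ratings
u2_mean = 2.5  # assumed average over all of u2's ratings

print(pearson(ratings, u1_mean, u2_mean))
```

The final part of the query then takes the 10 users with the highest pearson value and, for each movie u1 has not rated, adds up pearson * r.rating over those users to get the score column.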
https://stackoverflow.com/questions/65809879