I am doing hierarchical clustering in R, following this tutorial.
My code looks like this, but it ends with an error:
> distances = dist(movies[2:20], method="euclidean")
> clusterMovies = hclust(distances, method="ward")
> plot(clusterMovies)
Error in plot.hclust(clusterMovies) : 'merge' matrix has invalid contents
Posted on 2015-11-29 11:14:24
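It works fine for me... Make sure you downloaded the movieLens.txt file in the exact way shown in the tutorial's earlier video, i.e. without using "Save As" and Internet. Then this should work: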
movies = read.table("movieLens.txt", header=FALSE, sep="|",quote="\"")
# Add column names
colnames(movies) = c("ID", "Title", "ReleaseDate", "VideoReleaseDate", "IMDB", "Unknown", "Action", "Adventure", "Animation", "Childrens", "Comedy", "Crime", "Documentary", "Drama", "Fantasy", "FilmNoir", "Horror", "Musical", "Mystery", "Romance", "SciFi", "Thriller", "War", "Western")
# Remove unnecessary variables
movies$ID = NULL
movies$ReleaseDate = NULL
movies$VideoReleaseDate = NULL
movies$IMDB = NULL
# Remove duplicates
movies = unique(movies)
# Compute distances
distances = dist(movies[2:20], method = "euclidean")
# Hierarchical clustering
clusterMovies = hclust(distances, method = "ward")
# Plot the dendrogram
plot(clusterMovies)
This runs apart from one harmless warning message, issued after the clusterMovies command:
The "ward" method has been renamed to "ward.D"; note new "ward.D2"
https://stackoverflow.com/questions/33981561
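To avoid the renaming message, the method can be given by its new name. The following is a minimal sketch, assuming synthetic 0/1 data in place of movieLens.txt (the names genres, clusterSketch and groups are made up for illustration). It also checks that every column is numeric and free of NA, which is only a general sanity check on the input and not a confirmed cause of the error in the question.

```r
# Synthetic stand-in for the 19 genre columns (0/1 indicators).
# Assumption: this is NOT the real MovieLens data, only an illustration.
set.seed(1)
genres <- as.data.frame(matrix(rbinom(50 * 19, 1, 0.2), nrow = 50))

# General sanity check on the input: all columns numeric, no missing values.
stopifnot(all(sapply(genres, is.numeric)), !anyNA(genres))

# Compute Euclidean distances between rows.
distances <- dist(genres, method = "euclidean")

# "ward.D" is the current name of the method formerly called "ward",
# so no renaming message is printed.
clusterSketch <- hclust(distances, method = "ward.D")

# Cut the tree into 5 groups and count the rows in each.
groups <- cutree(clusterSketch, k = 5)
print(table(groups))
```

With the real data, the same calls would be applied to movies[2:20] instead of genres, followed by plot(clusterMovies) as in the code above.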