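This is the code I use with silhouette_score. Here I use agglomerative clustering with Ward linkage. I want to get the "centroids" of the agglomerative clusters. Can they be obtained from agglomerative clustering? So far I have only been able to get centroids from K-means and fuzzy c-means.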
df1

       Height  time_of_day  resolution
272  1.567925     1.375000    0.594089
562  1.807508     1.458333    0.594089
585  2.693542     0.416667    0.594089
610  1.036305     1.458333    0.594089
633  1.117111     0.416667    0.594089
658  1.542407     1.458333    0.594089
681  1.930844     0.416667    0.594089
802  1.505548     1.458333    0.594089
808  1.009369     1.708333    0.594089

import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_samples, silhouette_score

def clustering(df1):
    X = df1.iloc[:].values
    range_n_clusters = [2, 3, 4]
    for n_clusters in range_n_clusters:
        clusterer = AgglomerativeClustering(n_clusters=n_clusters, linkage='ward')
        clusterer.fit_predict(X)
        cluster_labels = clusterer.labels_
        silhouette_avg = silhouette_score(X, cluster_labels)
        if silhouette_avg > 0.4:
            print("For n_clusters =", n_clusters,
                  "The average silhouette_score is :", silhouette_avg)
            # Create a subplot with 1 row and 2 columns
            fig, (ax1, ax2) = plt.subplots(1, 2)
            fig.set_size_inches(15, 5)
            ax1.set_xlim([-0.1, 1])
            ax1.set_ylim([0, len(X) + (n_clusters + 1) * 10])
            sample_silhouette_values = silhouette_samples(X, cluster_labels)
            y_lower = 10
            for i in range(n_clusters):
                ith_cluster_silhouette_values = \
                    sample_silhouette_values[cluster_labels == i]
                ith_cluster_silhouette_values.sort()
                size_cluster_i = ith_cluster_silhouette_values.shape[0]
                y_upper = y_lower + size_cluster_i
                color = cm.nipy_spectral(float(i) / n_clusters)
                ax1.fill_betweenx(np.arange(y_lower, y_upper),
                                  0, ith_cluster_silhouette_values,
                                  facecolor=color, edgecolor=color, alpha=0.7)
                ax1.text(-0.05, y_lower + 0.5 * size_cluster_i, str(i))
                y_lower = y_upper + 10  # 10 for the 0 samples
            ax1.set_title("The silhouette plot for the various clusters.")
            ax1.set_xlabel("The silhouette coefficient values")
            ax1.set_ylabel("Cluster label")
            ax1.axvline(x=silhouette_avg, color="red", linestyle="--")
            ax1.set_yticks([])  # Clear the yaxis labels / ticks
            ax1.set_xticks([-0.1, 0, 0.2, 0.4, 0.6, 0.8, 1])
            ax = Axes3D(fig)
            colors = cm.nipy_spectral(cluster_labels.astype(float) / n_clusters)
            ax.scatter(X[:, 1], X[:, 2], X[:, 0], marker='o', s=20, lw=0, alpha=0.7,
                       c=colors, edgecolor='k')
            plt.suptitle(("Silhouette analysis for HAC-ward clustering on sample data "
                          "with n_clusters = %d" % n_clusters),
                         fontsize=14, fontweight='bold')
            plt.show()
    return

clusterer = AgglomerativeClustering(n_clusters=n_clusters, linkage='ward')
clusterer.fit_predict(X)
cluster_labels = clusterer.labels_

This code applies only to the agglomerative clustering method.

from scipy.cluster.hierarchy import centroid, fcluster
from scipy.spatial.distance import pdist
cluster = AgglomerativeClustering(n_clusters=4, affinity='euclidean', linkage='ward')
y = pdist(df1)
y

I also tried this code, but I am not sure whether 'y' is the correct centroid.
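
As a quick check on that attempt, here is a minimal sketch using the nine sample rows shown above (the array name X is only illustrative). It suggests that the output of pdist is not a set of centroids: pdist returns a condensed vector of pairwise distances between rows, with n(n-1)/2 entries for n rows, so 36 values for these 9 rows. Note also that scipy.cluster.hierarchy.centroid performs centroid linkage and returns a linkage matrix, not cluster centers.

```python
import numpy as np
from scipy.spatial.distance import pdist

# The nine sample rows from df1 (Height, time_of_day, resolution).
X = np.array([
    [1.567925, 1.375000, 0.594089],
    [1.807508, 1.458333, 0.594089],
    [2.693542, 0.416667, 0.594089],
    [1.036305, 1.458333, 0.594089],
    [1.117111, 0.416667, 0.594089],
    [1.542407, 1.458333, 0.594089],
    [1.930844, 0.416667, 0.594089],
    [1.505548, 1.458333, 0.594089],
    [1.009369, 1.708333, 0.594089],
])

# pdist computes the Euclidean distance between every pair of rows
# and returns them as a flat (condensed) 1-D array.
y = pdist(X)

n = X.shape[0]
n_pairs = n * (n - 1) // 2   # number of distinct row pairs

print(y.shape)    # length of the condensed distance vector
print(n_pairs)    # should match the length above
```
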

from sklearn.neighbors.nearest_centroid import NearestCentroid
clf = NearestCentroid()
clf.fit(df1["Height"],df1["time_of_day"])
print(clf.centroids_)

For this, I tried another method to get the X, Y centroids. It shows an error...
Please tell me whether I can get centroids from agglomerative clustering, or whether I should stick with fuzzy clustering.
Thanks

Answered on 2020-03-12 19:31:44
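I believe you can use agglomerative clustering and get the centroids with NearestCentroid. You only need to make a few adjustments to your code. Here is what worked for me:
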
from sklearn.neighbors import NearestCentroid
y_predict = clusterer.fit_predict(X)
#...
clf = NearestCentroid()
clf.fit(X, y_predict)
print(clf.centroids_)

I think the only thing missing from your code is that you did not keep the return value of fit_predict(). You can also try a dendrogram for a better visualization. The complete code can be found here.
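
To make the approach concrete, here is a self-contained sketch on the nine sample rows from the question. It assumes n_clusters=2 purely for illustration, and the names labels, centroids, and manual are my own. It fits AgglomerativeClustering with Ward linkage, gets the centroids from NearestCentroid (default Euclidean metric), and cross-checks them against the plain mean of each cluster's points.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.neighbors import NearestCentroid

# The nine sample rows from df1 (Height, time_of_day, resolution).
X = np.array([
    [1.567925, 1.375000, 0.594089],
    [1.807508, 1.458333, 0.594089],
    [2.693542, 0.416667, 0.594089],
    [1.036305, 1.458333, 0.594089],
    [1.117111, 0.416667, 0.594089],
    [1.542407, 1.458333, 0.594089],
    [1.930844, 0.416667, 0.594089],
    [1.505548, 1.458333, 0.594089],
    [1.009369, 1.708333, 0.594089],
])

# n_clusters=2 is an illustrative choice, not taken from the question.
clusterer = AgglomerativeClustering(n_clusters=2, linkage='ward')
labels = clusterer.fit_predict(X)   # keep the returned labels

# NearestCentroid stores one centroid per label; with the default
# Euclidean metric each centroid is the mean of that label's points.
clf = NearestCentroid()
clf.fit(X, labels)
centroids = clf.centroids_

# Cross-check: compute the per-cluster means directly with NumPy.
manual = np.array([X[labels == k].mean(axis=0) for k in np.unique(labels)])

print(centroids)
print(np.allclose(centroids, manual))
```

Agglomerative clustering does not use or store centroids itself, so these values are simply the means of the points assigned to each label after the fact. Because the resolution column is constant in this sample, some scikit-learn versions may emit a warning about zero within-class standard deviation; the centroids are still computed.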
https://stackoverflow.com/questions/56456572