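I have a KMeans function with the signature def kmeans(x, k, no_of_iterations): that ends with return points, centroids. It plots perfectly, and its code isn't really relevant here. However, I would like to compute the silhouette score for it and plot that for each value.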
#Load Data
data = load_digits().data
pca = PCA(2)
#Transform the data
df = pca.fit_transform(data)
X= df
#y = kmeans.fit_predict(X)
#Applying our function
label, centroids = kmeans(df,10,1000)#returns points value and centroids
y = label.fit_predict(data)
#Visualize the results
u_labels = np.unique(label)
for i in u_labels:
    plt.scatter(df[label == i, 0], df[label == i, 1], label=i)
plt.scatter(centroids[:,0] , centroids[:,1] , s = 80, color = 'k')
plt.legend()
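plt.show()

Above is the code that runs KMeans and plots the result; below is my attempt at computing the silhouette. It comes from an example that imports KMeans from sklearn, which I don't really want to do, and my code doesn't work either.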
silhouette_avg = silhouette_score(X, y)
print("The average silhouette_score is :", silhouette_avg)
# Compute the silhouette scores for each sample
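sample_silhouette_values = silhouette_samples(X, y)

You may notice that there is no value for y here, because I assumed y should be the number of clusters. So at first I set it to 10, and that gave an error message. Can anyone tell from this code what I need to do next to get this value?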
Answered 2021-04-01 04:24:49
Try this:
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from yellowbrick.cluster import KElbowVisualizer, SilhouetteVisualizer
mpl.rcParams["figure.figsize"] = (9,6)
# Generate synthetic dataset with 8 blobs
X, y = make_blobs(n_samples=1000, n_features=12, centers=8, shuffle=True, random_state=42)
# Instantiate the clustering model and visualizer
model = KMeans()
visualizer = KElbowVisualizer(model, k=(4,12))
visualizer.fit(X) # Fit the data to the visualizer
visualizer.show() # poof() is the deprecated name for show() in Yellowbrick 1.0+
# Instantiate the clustering model and visualizer
model = KMeans(8)
visualizer = SilhouetteVisualizer(model)
visualizer.fit(X) # Fit the data to the visualizer
visualizer.show() # Draw/show the data

Also, see this:
https://www.scikit-yb.org/en/latest/api/cluster/silhouette.html
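If you would rather keep your own kmeans function than switch to sklearn's KMeans, note that the second argument of silhouette_score and silhouette_samples is the array of per-sample cluster labels, not the number of clusters, which is probably why passing 10 raised an error. The sketch below assumes your kmeans returns one integer label per row as its first value. Since the question does not show that function, the kmeans defined here (including its seed parameter) is only a hypothetical stand-in and not your implementation.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_samples, silhouette_score


def assign(x, centroids):
    """Return the index of the nearest centroid for every row of x."""
    distances = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans(x, k, no_of_iterations, seed=0):
    """Hypothetical stand-in for the question's kmeans(); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples chosen at random.
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(no_of_iterations):
        labels = assign(x, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(x, centroids), centroids


# Same preprocessing as in the question.
X = PCA(2).fit_transform(load_digits().data)
labels, centroids = kmeans(X, 10, 1000)

# Pass the per-sample labels, not the number of clusters.
silhouette_avg = silhouette_score(X, labels)
sample_silhouette_values = silhouette_samples(X, labels)
print("The average silhouette_score is :", silhouette_avg)

# Average silhouette score for several values of k.
scores = {k: silhouette_score(X, kmeans(X, k, 1000)[0]) for k in range(2, 11)}
for k, score in scores.items():
    print(k, score)
```

To plot the score against k, something like plt.plot(list(scores), list(scores.values())) should be enough. The per-sample values in sample_silhouette_values are what a per-cluster silhouette plot is drawn from.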
https://stackoverflow.com/questions/66894840