Hello, I am looking for a Python example of K-means for a dataset with more than 6 features. Thank you.
Posted on 2019-02-20 11:00:42
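It is not entirely clear what you are trying to do. If I understand correctly, you want to fit a K-means clustering and visualize the result. However, your dataset has 8 dimensions, and you obviously cannot plot such a space directly.
What you can do is reduce the data to 2 dimensions and then create the plot.
For example: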
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
# read my data with pandas into a dataframe
data = pd.read_csv("data.csv")
# run a KMeans model with 3 clusters. Change that number to what you want
clustering_kmeans = KMeans(n_clusters=3, n_init=10)
clusters = clustering_kmeans.fit_predict(data)
# run PCA to reduce the dimensionality to 2 dimensions
reduced_data = PCA(n_components=2).fit_transform(data)
# create a new dataframe that contains the 2 dimensions and the cluster label
results = pd.DataFrame(reduced_data,columns=['pca1','pca2'])
results['label'] = clusters
# plot the results with a scatterplot
sns.scatterplot(x="pca1", y="pca2", hue="label", data=results)
plt.show()
https://datascience.stackexchange.com/questions/45839
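Since `data.csv` is not available here, the following is only a rough, self-contained sketch of the same idea on synthetic data. The 300 rows, 8 features, and 3 clusters are assumptions chosen for illustration. The standardization step and the explained-variance check are not part of the answer above. They are added because K-means relies on Euclidean distances, and because a 2-D PCA plot can only show the part of the variance that the first two components retain.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for data.csv (assumption): 300 rows, 8 features, 3 groups.
X, _ = make_blobs(n_samples=300, n_features=8, centers=3, random_state=0)

# Standardize so that no single feature dominates the Euclidean distances.
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full 8-dimensional space.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Project to 2 dimensions for plotting purposes only.
pca = PCA(n_components=2)
reduced = pca.fit_transform(X_scaled)

# Fraction of the total variance kept by the 2-D projection.
retained = float(pca.explained_variance_ratio_.sum())
print(f"variance retained by 2 components: {retained:.2f}")
```

If the retained fraction is low, clusters that are well separated in 8 dimensions may still overlap in the 2-D plot, so the picture should be read as a rough view rather than as proof of cluster quality.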