Q: 'KMeans' object has no attribute 'labels_'

Stack Overflow user

Asked 2018-04-15 17:38:45

1 answer · 27K views · 0 followers · 4 votes

I am using scikit-learn in my code. When I run it, I get the error "'KMeans' object has no attribute 'labels_'".

Traceback (most recent call last):
 File ".\kmeans.py", line 56, in <module>
   np.unique(km.labels_, return_counts=True)
AttributeError: 'KMeans' object has no attribute 'labels_'

Here is my code:

import pandas as pds
import nltk,re,string
from nltk.probability import FreqDist
from collections import defaultdict
from nltk.tokenize import sent_tokenize, word_tokenize, RegexpTokenizer
from nltk.tokenize.punkt import PunktSentenceTokenizer
from nltk.corpus import stopwords
from string import punctuation
from heapq import nlargest
# import and instantiate CountVectorizer
from sklearn.feature_extraction.text import CountVectorizer
vect = CountVectorizer()    
from sklearn.feature_extraction.text import TfidfVectorizer
vectorizer = TfidfVectorizer(ngram_range=(1,2),max_df=0.5, min_df=2,stop_words='english')
train_X = vectorizer.fit_transform(x)  

from sklearn.cluster import KMeans
import sklearn.cluster.k_means_
km = KMeans(n_clusters=3, init='k-means++', max_iter=100, n_init=1, 
  verbose=True)

import numpy as np
np.unique(km.labels_, return_counts=True)

text = {}
for i,cluster in enumerate(km.labels_):
    oneDocument = X[i]     
    if cluster not in text.keys():
        text[cluster] = oneDocument
    else:
        text[cluster] += oneDocument        

_stopwords = set(stopwords.words('english')+ list(punctuation))

keywords = {}
counts = {}

for cluster in range(3):
    word_sent =  word_tokenize(text[cluster].lower())
    word_sent = [word for word in word_sent if word not in _stopwords]
    freq = FreqDist(word_sent)
    keywords[cluster] =  nlargest(100, freq, key=freq.get)
    counts[cluster] = freq

unique_keys={}
for cluster in range(3):
    other_clusters = list(set(range(3))-set([cluster]))
    keys_other_clusters = set(keywords[other_clusters[0]]).union(
        set(keywords[other_clusters[1]]))
    unique=set(keywords[cluster])-keys_other_clusters
    unique_keys[cluster]= nlargest(100, unique, key=counts[cluster].get)

#print(unique_keys)
print(keywords)

The goal is to get the keywords for each cluster. I have tried to solve this, but I don't know what I'm missing.

python

machine-learning

scikit-learn

1 answer

Stack Overflow user

Accepted answer

Answered 2018-04-15 17:46:50

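For the KMeans object to have the labels_ attribute, you must first fit it. In the question's code, KMeans is constructed but fit is never called.

Without fitting, accessing the attribute raises an error: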

from sklearn.cluster import KMeans
km = KMeans()
print(km.labels_)
>>>AttributeError: 'KMeans' object has no attribute 'labels_'
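
In scikit-learn, attributes with a trailing underscore (such as labels_) are set only by fit. As a minimal sketch of checking this before touching such an attribute: the helper name is_fitted is hypothetical, and calling check_is_fitted without an attribute list assumes a reasonably recent scikit-learn (0.22 or later).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


def is_fitted(estimator):
    """Hypothetical helper: True if the estimator has been fitted."""
    try:
        check_is_fitted(estimator)
        return True
    except NotFittedError:
        return False


km = KMeans(n_clusters=3, n_init=10, random_state=0)
unfitted = is_fitted(km)  # no fit() yet, so labels_ does not exist

# Illustrative random data, not from the question
X = np.random.RandomState(0).rand(100, 2)
km.fit(X)
fitted = is_fitted(km)  # fit() has now set labels_, cluster_centers_, etc.
```
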

After fitting:

from sklearn.cluster import KMeans
import numpy as np
km = KMeans()
X = np.random.rand(100, 2)
km.fit(X)
print(km.labels_)
>>>[1 6 7 4 6 6 7 5 6 0 0 7 3 4 5 7 5 0 3 4 0 6 1 6 7 5 4 3 4 2 1 2 1 4 6 3 6 1 7 6 6 7 4 1 1 0 4 2 5 0 6 3 1 0 7 6 2 7 7 5 2 7 7 3 2 1 2 2 4 7 5 3 2 6 5 1 6 2 4 2 3 2 2 2 1 2 0 5 7 2 4 4 5 4 4 1 1 4 5 0]
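
Applied to the question's pipeline, the fix would be to call fit on the TF-IDF matrix before reading labels_. The sketch below is an illustration under assumptions, not the asker's exact code: the docs list is a made-up stand-in for the question's x (which is not shown), n_init and random_state are chosen here only for repeatability, and the per-cluster loop assumes the question's undefined X was meant to be the raw documents.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in corpus; the question's `x` is not shown
docs = [
    "the cat sat on the mat",
    "a cat and a kitten play",
    "stocks fell as markets closed",
    "markets rallied and stocks rose",
    "the team won the football match",
    "football fans cheered the team",
]

vectorizer = TfidfVectorizer(stop_words="english")
train_X = vectorizer.fit_transform(docs)

km = KMeans(n_clusters=3, init="k-means++", max_iter=100,
            n_init=10, random_state=0)
km.fit(train_X)  # the step missing from the question; this sets km.labels_

cluster_ids, cluster_sizes = np.unique(km.labels_, return_counts=True)

# Concatenate the raw documents of each cluster into one string
text = {}
for doc, cluster in zip(docs, km.labels_):
    key = int(cluster)
    text[key] = (text.get(key, "") + " " + doc).strip()
```

If only the labels are needed, km.fit_predict(train_X) fits the model and returns the same label array in one call.
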

13 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/49844928
