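I trained an LDA model with 100 topics. As far as I understand, every topic should be returned with some probability for a document, and those probabilities should sum to 1.
But when I run this code, I only get two topics.
Please help.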
import functools

# dictionary, tokenize and lda_model come from the training step (not shown)
text = "A blood cell, also called a hematocyte, is a cell produced by hematopoiesis and normally found in blood."
# transform text into the bag-of-words space
bow_vector = dictionary.doc2bow(tokenize(text))
lda_vector = lda_model[bow_vector]
print("LDA Output: ", lda_vector)
print("\nTop Keywords from highest prob Topic: ", lda_model.print_topic(max(lda_vector, key=lambda item: item[1])[0]))
print("\n\nAddition of all the probabilities from LDA output:", functools.reduce(lambda x, y: x + y, [i[1] for i in lda_vector]))

Output:

LDA Output:  [(64, 0.6952628), (69, 0.18223721)]

Top Keywords from highest prob Topic:  0.042*"health" + 0.032*"medical" + 0.017*"patient" + 0.016*"cancer" + 0.015*"hospital" + 0.015*"said" + 0.015*"treatment" + 0.012*"doctor" + 0.012*"care" + 0.012*"drug"


Addition of all the probabilities from LDA output: 0.8775
Posted on 2019-02-28 00:19:49
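If you set the minimum_probability parameter of LdaModel to 0, the probabilities will sum to 1 (or very close to 1, because of approximation error). This parameter controls which topics are filtered out of the result returned for a document: any topic whose probability falls below the threshold (0.01 by default) is dropped, which is why only two of the 100 topics appear and their sum is less than 1.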
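To make the effect of the threshold concrete, here is a small pure-Python sketch of that kind of filtering. It is only an illustration: `filter_topics` is a hypothetical helper, not gensim's implementation, and the topic distribution is made up so that two topics dominate and the other 98 share a small amount of probability mass.

```python
def filter_topics(topic_probs, minimum_probability=0.01):
    """Keep (topic_id, probability) pairs at or above the threshold.

    Hypothetical helper that mimics the idea of minimum_probability;
    it is not gensim code.
    """
    return [(i, p) for i, p in enumerate(topic_probs) if p >= minimum_probability]


# Made-up distribution over 100 topics: two dominant topics, 98 tiny ones.
num_topics = 100
weights = [0.005] * num_topics
weights[64] = 6.0
weights[69] = 1.5
total = sum(weights)
topic_probs = [w / total for w in weights]  # normalised, sums to 1

# With a 0.01 threshold only the two dominant topics survive,
# so the reported probabilities no longer sum to 1.
filtered = filter_topics(topic_probs)
print(len(filtered), sum(p for _, p in filtered))

# With the threshold at 0 every topic is kept and the sum is 1 again
# (up to floating-point error).
unfiltered = filter_topics(topic_probs, minimum_probability=0.0)
print(len(unfiltered), sum(p for _, p in unfiltered))
```

With the model from the question, the equivalent would presumably be either passing minimum_probability=0 when constructing LdaModel, or calling lda_model.get_document_topics(bow_vector, minimum_probability=0) instead of lda_model[bow_vector].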
https://stackoverflow.com/questions/54823225