How to convert LDA output to a word-topic matrix in R?

Asked by a Stack Overflow user on 2017-02-16 06:51:46

library(tm)
library(topicmodels)
# dtm is a DocumentTermMatrix built beforehand from the corpus
lda_topicmodel <- LDA(dtm, k = 20, control = list(seed = 1234))

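How can I convert this into a word-topic matrix and a document-topic matrix in R?

Unfortunately, an object of type 'S4' is not subsettable, so I had to copy a subset of the data to show here.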

Topic 1     Topic 2   Topic 3   Topic 4    Topic 5     Topic 6    Topic 7         Topic 8    Topic 9      Topic 10    
[1,] "flooding"  "beach"   "sets"    "flooding" "storm"     "fwy"      "storms"        "flooding" "socal"      "rain"      
[2,] "erosion"   "long"    "alltime" "just"     "flooding"  "due"      "thunderstorms" "via"      "major"      "california"
[3,] "cause"     "abc7"    "rain"    "almost"   "years"     "closures" "flash"         "public"   "throughout" "nearly"    
[4,] "emergency" "day"     "slides"  "hardcore" "mudslides" "avoid"    "continue"      "asks"     "abc7"       "southern"  
[5,] "highway"   "history" "last"    "spun"     "snow"      "latest"   "possible"      "call"     "streets"    "storms"  



Topic 11 Topic 12   Topic 13  Topic 14      Topic 15      Topic 16 Topic 17   Topic 18   Topic 19     Topic 20     
[1,] "abc7"   "abc7"     "like"    "widespread"  "widespread"  "across" "rainfall" "flooding" "flooding"   "vehicles"   
[2,] "beach"  "flooding" "closed"  "batters"     "biggest"     "can"    "record"   "region"   "storm"      "several"    
[3,] "long"   "stranded" "live"    "california"  "evacuations" "stay"   "breaks"   "reported" "california" "getting"    
[4,] "fwy"    "county"   "raining" "evacuations" "mudslides"   "home"   "long"     "corona"   "causes"     "floodwaters"
[5,] "710"    "san"      "blog"    "mudslides"   "years"       "wires"  "beach"    "across"   "related"    "stranded" 

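The image shows a subset of the words in each topic: LDA word-topic. I would like to write the contents of the S4 object to a csv file in the form of a word-topic matrix, like this: Word-Topic Matrix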

1 Answer

Answered by a Stack Overflow user on 2017-02-16 12:57:00

I used a dataset available in R, since we cannot reproduce your data.

# load the libraries
library(topicmodels)
library(tm)

# load the data we'll be using
data("AssociatedPress")

# estimate an LDA model using the VEM algorithm (the default)
# k (the number of topics) is set to 2 here,
# just as an example
ap_lda <- LDA(AssociatedPress, 
              k = 2, 
              control = list(seed = 1234))

# get all the terms in a dataframe 
as.data.frame(terms(ap_lda, dim(ap_lda)[1]))

The output will be:

  Topic 1    Topic 2
1 percent          i
2 million  president
3     new government
4    year     people
5 billion     soviet
6    last        new
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/42261610
