I am trying to compute the coherence measure reported here [1].
I work with quanteda, so what I have is a dfm.
However, the linked guide uses a dtm:

library(textmineR)  # CreateDtm(), TermDocFreq()
library(dplyr)      # %>% and select()

# create DTM
dtm <- CreateDtm(tokens$text,
                 doc_names = tokens$ID,
                 ngram_window = c(1, 2))
# explore the basic frequency
tf <- TermDocFreq(dtm = dtm)
original_tf <- tf %>% select(term, term_freq, doc_freq)
rownames(original_tf) <- 1:nrow(original_tf)
# Eliminate words appearing less than 2 times or in more than half of the
# documents
vocabulary <- tf$term[ tf$term_freq > 1 & tf$doc_freq < nrow(dtm) / 2 ]
dtm = dtm

How can I use a dfm instead of a dtm in this computation?
More specifically, how can I create the vocabulary from a dfm rather than a dtm?

[1]: https://towardsdatascience.com/beginners-guide-to-lda-topic-modelling-with-r-e57a5a8e7a25
Posted on 2021-04-18 03:26:49
You want convert(). For example:
convert(yourdfm, to = "topicmodels") or
convert(yourdfm, to = "tm"). See ?convert.
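If the goal is only to rebuild the vocabulary filter from the question, it may not be necessary to convert at all. The following is a minimal sketch, not the tutorial's own method: it assumes quanteda 2.0 or later (for featfreq()), and the toy texts and the object name yourdfm are placeholders standing in for tokens$text and your real dfm.

```r
library(quanteda)

# Hypothetical toy documents standing in for tokens$text / tokens$ID
txt <- c(d1 = "apple apple banana",
         d2 = "banana cherry",
         d3 = "banana date",
         d4 = "cherry elder")

# Unigrams and bigrams, mirroring ngram_window = c(1, 2) in CreateDtm()
toks <- tokens_ngrams(tokens(txt), n = 1:2)
yourdfm <- dfm(toks)

# Term frequency and document frequency taken directly from the dfm
term_freq <- featfreq(yourdfm)
doc_freq  <- docfreq(yourdfm)

# Same rule as in the question: keep terms that occur more than once
# and appear in fewer than half of the documents
keep <- term_freq > 1 & doc_freq < ndoc(yourdfm) / 2
vocabulary <- featnames(yourdfm)[keep]
```

dfm_trim(yourdfm, min_termfreq = 2, max_docfreq = 0.5, docfreq_type = "prop") applies a similar filter to the dfm itself, but its bounds are inclusive, so a term appearing in exactly half of the documents would be kept there and dropped by the rule above.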
https://stackoverflow.com/questions/67141516