Q: Get values from one dataframe and loop an operation over another dataframe

Stack Overflow user

Asked 2019-02-22 13:30:31

1 answer, 40 views, 0 followers, 1 vote

vocab
 wordIDx V1
    1  archive
    2  name
    3  atheism
    4  resources
    5  alt

wordIDx newsgroup_ID    docIdx  word/doc    totalwords/doc totalwords/newsgroup wordID/newsgroup    P(W_j)
1   1   196 3   1240    47821   2   0.028130269
1   1   47  2   1220    47821   2   0.028130269
2   12  4437    1   702 47490   8   0.8
3   12  4434    1   673 47490   8   0.035051912
5   12  4398    1   53  47490   8   0.4
3   12  4564    11  1539    47490   8   0.035051912

For each wordIDx in vocab, I need to compute the following formula. For example, for wordIDx = 1, my value should be:

max(log(0.02813027)+sum(log(2/47821),log(2/47821)))
= -23.73506
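
The arithmetic in this example can be checked directly in R. The snippet below is only a sanity check of the single case wordIDx = 1; the variable names `p_w`, `ratio`, and `value` are illustrative and do not come from the question. With a single candidate value, the outer max() simply returns that value.

```r
# Values taken from the two rows with wordIDx = 1 in the table above.
p_w   <- 0.02813027   # P(W_j)
ratio <- 2 / 47821    # wordID/newsgroup divided by totalwords/newsgroup

# One log ratio per row of wordIDx = 1 (two rows), added to the log of P(W_j).
value <- max(log(p_w) + sum(log(ratio), log(ratio)))
print(value)  # the question reports -23.73506
```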

I currently have the following code:

 classifier_3$ans<- max(log(classifier_3$`P(W_j)`)+ (sum(log(classifier_3$`wordID/newsgroup`/classifier_3$`totalwords/newsgroup`))))

How can I loop so that every wordIDx from vocab is considered and the calculation from the example above is carried out for each one?

loops

dataframe

1 Answer

Stack Overflow user

Accepted answer

Answered 2019-02-22 14:02:04

Something like this, but you really need to clean up your column names.

vocab <- read.table(text = "wordIDx V1
1  archive
2  name
3  atheism
4  resources
5  alt", header = TRUE, stringsAsFactors = FALSE)

classifier_3 <- read.table(text = "wordIDx newsgroup_ID    docIdx  word/doc            totalwords/doc totalwords/newsgroup wordID/newsgroup    P(W_j)
1   1   196 3   1240    47821   2   0.028130269
1   1   47  2   1220    47821   2   0.028130269
2   12  4437    1   702 47490   8   0.8
3   12  4434    1   673 47490   8   0.035051912
5   12  4398    1   53  47490   8   0.4
3   12  4564    11  1539    47490   8   0.035051912", header = TRUE, stringsAsFactors = FALSE)

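# Keep only the first row for each wordIDx (later rows for the same word are dropped).
classifier_3 <- classifier_3[!duplicated(classifier_3$wordIDx), ]
# Attach the vocabulary word (V1) to each remaining row.
classifier_3 <- merge(vocab, classifier_3, by = c("wordIDx"))
# read.table() replaced "/", "(" and ")" in the headers with ".", hence `P.W_j.` and so on.
classifier_3$ans<- pmax(log(classifier_3$`P.W_j.`)+ 
                     (log(classifier_3$`wordID.newsgroup`/classifier_3$`totalwords.newsgroup`) +
                            # isn't that times 2?
                            log(classifier_3$`wordID.newsgroup`/classifier_3$`totalwords.newsgroup`)),
                        log(classifier_3$`wordID.newsgroup`/classifier_3$`totalwords.newsgroup`))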
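
The code above keeps one row per wordIDx and writes the log ratio out twice by hand. A group-wise variant is sketched below, on one reading of the question: sum one log ratio per row of each word, add log(P(W_j)), then take the maximum over newsgroups. That reading, the assumption that P(W_j) is constant within each word and newsgroup, and the column names `word_count`, `total_words`, `p_w`, `log_ratio`, and `score` are all introduced here for illustration and are not from the original post. Note also that pmax() returns the element-wise larger of its two arguments, so when both the probability and the ratio are below 1 it will pick the bare log ratio rather than the sum.

```r
# Sample rows from the question, with syntactically valid (hypothetical) names.
classifier_3 <- data.frame(
  wordIDx      = c(1, 1, 2, 3, 5, 3),
  newsgroup_ID = c(1, 1, 12, 12, 12, 12),
  word_count   = c(2, 2, 8, 8, 8, 8),                          # wordID/newsgroup
  total_words  = c(47821, 47821, 47490, 47490, 47490, 47490),  # totalwords/newsgroup
  p_w          = c(0.028130269, 0.028130269, 0.8,
                   0.035051912, 0.4, 0.035051912)              # P(W_j)
)

vocab <- data.frame(
  wordIDx = 1:5,
  V1      = c("archive", "name", "atheism", "resources", "alt"),
  stringsAsFactors = FALSE
)

# One log ratio per row.
classifier_3$log_ratio <- log(classifier_3$word_count / classifier_3$total_words)

# Sum the log ratios within each (wordIDx, newsgroup_ID) group.
# p_w is included as a grouping column on the assumption that it is
# constant within a group, as it is in the sample data.
scores <- aggregate(log_ratio ~ wordIDx + newsgroup_ID + p_w,
                    data = classifier_3, FUN = sum)
scores$score <- log(scores$p_w) + scores$log_ratio

# Maximum score per word across newsgroups.
ans <- aggregate(score ~ wordIDx, data = scores, FUN = max)

# Left join so that every vocab entry appears; words with no rows get NA.
result <- merge(vocab, ans, by = "wordIDx", all.x = TRUE)
print(result)
```

For wordIDx = 1 this reproduces the value from the question (about -23.735), and wordIDx = 4, which has no rows in classifier_3, comes back as NA.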

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/54820629
