Suppose I have the following data frame:
test = read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
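-1 -1 -1 -1 1 A", header = TRUE)
I want to match all columns whose names contain "total_score" but not "partner", and then create a new measure that sums all of those "total_score" columns, treating -1 as 0.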
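I can get a basic rowSum like this:
mutate(net_correct = rowSums(select(., grep("total_score", names(.)))))
Note, however, that this does not exclude the columns matching "partner", and I can't work out how to do both in a single grep call.
What I want now is a total_correct value: a rowSum over the same columns, except that -1 is treated as 0.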
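This would produce a data.frame like the following:
  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter total_sum
1             1            -1                     1             1            -1      B         2
2             1             1                     1            -1             1      C         3
3            -1            -1                    -1            -1             1      A         1
One approach might be to count the number of 1s (rather than actually summing), but I can't figure out how to do that inside a mutate call.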
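The counting idea mentioned in the question can be sketched in base R. This is only an illustration, not part of the original thread: the name `score_cols` is made up for the example, and it assumes the score columns only ever hold 1 or -1, so that counting 1s equals summing with -1 treated as 0.

```r
# Rebuild the example data from the question.
test <- read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
-1 -1 -1 -1 1 A", header = TRUE)

# Logical mask (hypothetical name): TRUE for columns whose name contains
# "total_score" but not "partner".
score_cols <- grepl("total_score", names(test)) & !grepl("partner", names(test))

# test[score_cols] == 1 is a logical matrix; rowSums() counts the TRUEs per row.
test$total_correct <- rowSums(test[score_cols] == 1)

print(test)
```

Combining two `grepl()` calls avoids having to express "contains this but not that" in a single regular expression.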
Answered 2019-11-20 16:28:51
You can do it like this:
test %>%
  mutate(net_correct = select(., setdiff(contains("total_score"), contains("partner"))) %>%
           replace(., . == -1, 0) %>%
           rowSums())
#  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter net_correct
#1             1            -1                     1             1            -1      B           2
#2             1             1                     1            -1             1      C           3
#3            -1            -1                    -1            -1             1      A           1
Answered 2019-11-20 16:16:30
You can simply modify the regex so that it only captures columns whose names start with "total_score", using the caret (^) anchor:
mutate(net_correct = rowSums(select(., grep("^total_score", names(.)))))
To treat negative values as zero, you can use mutate_all():
test %>%
  mutate(total_correct = rowSums(select(., grep("^total_score", names(.))) %>%
                                   mutate_all(function(x) { as.numeric(x > 0) })
  )
  )
Answered 2019-11-20 16:29:55
Another possibility:
test %>%
  mutate(net_correct = rowSums(select(., contains("total"), -contains("partner")) %>%
                                 replace(., . == -1, 0)))
  total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter net_correct
1             1            -1                     1             1            -1      B           2
2             1             1                     1            -1             1      C           3
3            -1            -1                    -1            -1             1      A           1
https://stackoverflow.com/questions/58958790
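For readers on a newer dplyr, the same result can probably be written more compactly. The sketch below is not taken from any of the answers above. It assumes dplyr 1.0.0 or later (where `across()` is available) and that every score column name begins with "total_score", so that `starts_with()` is enough to leave out the "partner" column.

```r
library(dplyr)

# Rebuild the example data from the question.
test <- read.table(text = "total_score_1 total_score_2 partner_total_score_1 total_score_3 total_score_4 letter
1 -1 1 1 -1 B
1 1 1 -1 1 C
-1 -1 -1 -1 1 A", header = TRUE)

# across() returns the selected columns as a data frame; comparing it with 0
# gives a logical matrix, and rowSums() counts the positive entries per row,
# which is the same as summing with -1 treated as 0.
result <- test %>%
  mutate(total_correct = rowSums(across(starts_with("total_score")) > 0))

print(result)
```

Because `starts_with()` matches only the prefix, `partner_total_score_1` is never selected and no explicit exclusion is needed.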