Calculating a value for each row based on a condition

Stack Overflow user
Asked 2020-08-29 06:07:18
Answers: 1, Views: 26, Followers: 0, Votes: 0

I'm looking for help counting "YES" values across columns in R, preferably with a tidy solution.

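I have a dataset, df_help, and need to create a new variable that checks the columns named in the object dim_1 and counts the total number of matches. The desired result is shown as the dim_1 column in df_help_reprex.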

Is there a dplyr solution, or would writing a function and using the apply family be a better approach?

Thanks!

> df_help_reprex <- df_help %>% 
+   mutate(dim_1 = c(1, 0, 2, 0, 0, 0, 0, 1, 2, 0))
> df_help
# A tibble: 10 x 8
   symp_ams symp_nvd symp_pain symp_fever vitals_gcs vitals_rr_10_24 vitals_temp_38 vitals_hr_100
   <fct>    <fct>    <fct>     <fct>      <fct>      <fct>           <fct>          <fct>        
 1 NO       YES      NO        NO         NO         NO              NO             YES          
 2 NO       NO       NO        NO         NO         NO              NO             NO           
 3 YES      NO       NO        NO         YES        NO              UNK            YES          
 4 NO       NO       NO        NO         NO         NO              UNK            YES          
 5 NO       NO       NO        YES        YES        NO              YES            NO           
 6 NO       NO       NO        NO         NO         NO              NO             NO           
 7 NO       NO       NO        YES        NO         NO              NO             NO           
 8 NO       YES      NO        NO         NO         NO              NO             NO           
 9 YES      NO       NO        NO         YES        NO              NO             YES          
10 NO       NO       NO        YES        NO         YES             YES            YES          
> dim_1
[1] "symp_ams"   "symp_nvd"   "symp_pain"  "vitals_gcs"
> df_help_reprex
# A tibble: 10 x 9
   symp_ams symp_nvd symp_pain symp_fever vitals_gcs vitals_rr_10_24 vitals_temp_38 vitals_hr_100 dim_1
   <fct>    <fct>    <fct>     <fct>      <fct>      <fct>           <fct>          <fct>         <dbl>
 1 NO       YES      NO        NO         NO         NO              NO             YES               1
 2 NO       NO       NO        NO         NO         NO              NO             NO                0
 3 YES      NO       NO        NO         YES        NO              UNK            YES               2
 4 NO       NO       NO        NO         NO         NO              UNK            YES               0
 5 NO       NO       NO        YES        YES        NO              YES            NO                0
 6 NO       NO       NO        NO         NO         NO              NO             NO                0
 7 NO       NO       NO        YES        NO         NO              NO             NO                0
 8 NO       YES      NO        NO         NO         NO              NO             NO                1
 9 YES      NO       NO        NO         YES        NO              NO             YES               2
10 NO       NO       NO        YES        NO         YES             YES            YES               0

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-08-29 07:42:37

I would suggest a tidyverse approach: reshape the data, then count the number of matching values. Here is the code:

library(tidyverse)
#Data
df_help <- structure(list(symp_ams = c("NO", "NO", "YES", "NO", "NO", "NO", 
"NO", "NO", "YES", "NO"), symp_nvd = c("YES", "NO", "NO", "NO", 
"NO", "NO", "NO", "YES", "NO", "NO"), symp_pain = c("NO", "NO", 
"NO", "NO", "NO", "NO", "NO", "NO", "NO", "NO"), symp_fever = c("NO", 
"NO", "NO", "NO", "YES", "NO", "YES", "NO", "NO", "YES"), vitals_gcs = c("NO", 
"NO", "YES", "NO", "YES", "NO", "NO", "NO", "YES", "NO"), vitals_rr_10_24 = c("NO", 
"NO", "NO", "NO", "NO", "NO", "NO", "NO", "NO", "YES"), vitals_temp_38 = c("NO", 
"NO", "UNK", "UNK", "YES", "NO", "NO", "NO", "NO", "YES"), vitals_hr_100 = c("YES", 
"NO", "YES", "YES", "NO", "NO", "NO", "NO", "YES", "YES")), row.names = c(NA, 
-10L), class = "data.frame")
#Vector for match
dim_1 <- c("symp_ams","symp_nvd","symp_pain","vitals_gcs")

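Next comes the solution using tidyverse functions. We add an id for each row and reshape the data to long format. We then check the condition, aggregate the values per id, and finally bind the result to the original data frame: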

#Reshape
df_help %>% bind_cols(df_help %>% mutate(id=1:n()) %>%
                        pivot_longer(cols = -id) %>%
                        mutate(Num=ifelse(name %in% dim_1 & value=='YES',1,0)) %>%
                        group_by(id) %>% summarise(Dim1=sum(Num)) %>% select(-id))

Output:

   symp_ams symp_nvd symp_pain symp_fever vitals_gcs vitals_rr_10_24 vitals_temp_38 vitals_hr_100 Dim1
1        NO      YES        NO         NO         NO              NO             NO           YES    1
2        NO       NO        NO         NO         NO              NO             NO            NO    0
3       YES       NO        NO         NO        YES              NO            UNK           YES    2
4        NO       NO        NO         NO         NO              NO            UNK           YES    0
5        NO       NO        NO        YES        YES              NO            YES            NO    1
6        NO       NO        NO         NO         NO              NO             NO            NO    0
7        NO       NO        NO        YES         NO              NO             NO            NO    0
8        NO      YES        NO         NO         NO              NO             NO            NO    1
9       YES       NO        NO         NO        YES              NO             NO           YES    2
10       NO       NO        NO        YES         NO             YES            YES           YES    0
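
If reshaping feels heavy for this task, a row-wise count over the selected columns may be enough. The following is only a sketch, assuming dplyr 1.0.0 or later for `across()`; `df_small` is a hypothetical three-row subset of the example data, used here to keep the block short. Values such as `UNK` simply count as "not YES".

```r
library(dplyr)

# Hypothetical reduced copy of the example data (first three rows, five columns)
df_small <- data.frame(
  symp_ams   = c("NO", "NO", "YES"),
  symp_nvd   = c("YES", "NO", "NO"),
  symp_pain  = c("NO", "NO", "NO"),
  symp_fever = c("NO", "NO", "NO"),
  vitals_gcs = c("NO", "NO", "YES")
)

# Columns to check, as in the question
dim_1 <- c("symp_ams", "symp_nvd", "symp_pain", "vitals_gcs")

# dplyr: turn each selected column into TRUE/FALSE, then sum the TRUEs per row
res_dplyr <- df_small %>%
  mutate(Dim1 = rowSums(across(all_of(dim_1), ~ .x == "YES")))

# Base R equivalent: compare the selected columns at once and sum per row
res_base <- rowSums(df_small[dim_1] == "YES")
```

Both versions leave the original columns untouched and do not depend on a row id, so there is no join or bind step to keep aligned. If the columns can contain `NA`, `rowSums(..., na.rm = TRUE)` is one way to ignore them.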

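By the way, row 5 of your expected output appears to contain a typo: vitals_gcs is YES in that row and is one of the columns in the vector dim_1, so the count should be 1 rather than 0.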
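
A quick way to locate such a discrepancy is to compare the computed counts with the vector typed in the question. This is a minimal sketch in base R; `df_dim` is a hypothetical data frame holding only the four `dim_1` columns from the example data, and `expected` is the `dim_1` vector from the question.

```r
# Only the four columns listed in dim_1 (other columns omitted for brevity)
df_dim <- data.frame(
  symp_ams   = c("NO", "NO", "YES", "NO", "NO", "NO", "NO", "NO", "YES", "NO"),
  symp_nvd   = c("YES", "NO", "NO", "NO", "NO", "NO", "NO", "YES", "NO", "NO"),
  symp_pain  = rep("NO", 10),
  vitals_gcs = c("NO", "NO", "YES", "NO", "YES", "NO", "NO", "NO", "YES", "NO")
)

# Values entered by hand in the question's df_help_reprex
expected <- c(1, 0, 2, 0, 0, 0, 0, 1, 2, 0)

# Count "YES" per row and find the rows where the two disagree
computed <- unname(rowSums(df_dim == "YES"))
mismatch <- which(computed != expected)
mismatch  # row 5 is the only disagreement
```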

Votes: 1
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/63641423
