Q: Converting dummy variables into a single column in R

Stack Overflow user

Asked 2022-01-03 17:33:15

Answers: 1 · Views: 227 · Followers: 0 · Votes: 4

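I have the following table in R, which lists a person's race, gender, age, and cholesterol test. Age and cholesterol test are stored as dummy variables. Age can be low, medium, or high, while a cholesterol test can be low or high. I want to convert the age and cholesterol columns into single columns in which low is coded as 1, medium as 2, and high as 3. A cholesterol test is either low or high; if a person never took one, neither dummy is set and the value should be N/A in the expected output. I want the solution to be dynamic, so that the code still works if I have more columns in this format (for example, some new test whose dummy variables are categorized as high, low, or medium).

How can I do this in R?

Input: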

  race  gender age.low_tm1 age.medium_tm1 age.high_tm1 chol_test.low_tm1 chol_test.high_tm1
  <chr>  <int>       <int>          <int>        <int>             <int>              <int>
1 white      0           1              0            0                 0                  0
2 white      0           1              0            0                 0                  0
3 white      1           1              0            0                 0                  0
4 black      1           0              1            0                 0                  0
5 white      0           0              0            1                 0                  1
6 black      0           0              1            0                 1                  0

Expected output:

  race  gender   age  chol_test
1 white      0     1        n/a  
2 white      0     1        n/a
3 white      1     1        n/a
4 black      1     2        n/a
5 white      0     3          3
6 black      0     2          1

dplyr

1 Answer

Stack Overflow user

Answered 2022-01-03 17:42:16

Maybe this helps

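library(dplyr)
library(tidyr)
library(stringr)

df1 %>% 
  # scale each 0/1 dummy by its level code (low = 1, medium = 2, high = 3)
  mutate(across(contains("_"), ~ 
    . * setNames(1:3, c("low", "medium", "high"))[
      str_extract(cur_column(), "low|medium|high")])) %>% 
  # drop the "_tm1" suffix so names read "age.low", "chol_test.high", ...
  rename_with(~ str_remove(., "_tm1")) %>% 
  # split names at ".": the prefix becomes a column, the level goes to `categ`
  pivot_longer(cols = -c(race, gender), 
    names_to = c(".value", "categ"), names_sep = "\\.") %>% 
  # keep only the level rows in which some dummy was set
  filter(age > 0 | chol_test > 0) %>% 
  select(-categ) %>% 
  # 0 means "no level flagged", so report it as NA
  mutate(chol_test = na_if(chol_test, 0))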

-output

# A tibble: 7 × 4
  race  gender   age chol_test
  <chr>  <int> <int>     <int>
1 white      0     1        NA
2 white      0     1        NA
3 white      1     1        NA
4 black      1     2        NA
5 white      0     3         3
6 black      0     0         1
7 black      0     2        NA
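
Note that the output above has seven rows for six inputs: the sixth input row (black, gender 0) is split into one row carrying chol_test = 1 and another carrying age = 2, because pivot_longer() yields one row per level and the filter keeps both. Below is a minimal sketch of one way to get exactly one row per input row. It assumes at most one level is flagged per variable, and the names row_id, level_codes, variable, level, flag and code are introduced here for illustration only; they are not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same data as df1 in the answer, repeated so the block runs on its own.
df1 <- data.frame(
  race = c("white", "white", "white", "black", "white", "black"),
  gender = c(0L, 0L, 1L, 1L, 0L, 0L),
  age.low_tm1 = c(1L, 1L, 1L, 0L, 0L, 0L),
  age.medium_tm1 = c(0L, 0L, 0L, 1L, 0L, 1L),
  age.high_tm1 = c(0L, 0L, 0L, 0L, 1L, 0L),
  chol_test.low_tm1 = c(0L, 0L, 0L, 0L, 0L, 1L),
  chol_test.high_tm1 = c(0L, 0L, 0L, 0L, 1L, 0L)
)

# Assumed coding, taken from the question: low = 1, medium = 2, high = 3.
level_codes <- c(low = 1L, medium = 2L, high = 3L)

result <- df1 %>%
  # remember which input row each long row came from
  mutate(row_id = row_number()) %>%
  # "age.low_tm1" -> variable = "age", level = "low"
  pivot_longer(
    cols = matches("\\.(low|medium|high)_tm1$"),
    names_to = c("variable", "level"),
    names_pattern = "^(.*)\\.(low|medium|high)_tm1$",
    values_to = "flag"
  ) %>%
  # one code per (input row, variable): the flagged level, or NA if none is set
  group_by(row_id, race, gender, variable) %>%
  summarise(
    code = if (any(flag == 1L)) {
      unname(level_codes[level[flag == 1L][1]])
    } else {
      NA_integer_
    },
    .groups = "drop"
  ) %>%
  # back to one column per variable
  pivot_wider(names_from = variable, values_from = code) %>%
  arrange(row_id) %>%
  select(-row_id)

print(result)
```

Because the columns are selected by pattern rather than by name, a further group such as a hypothetical bp_test.low_tm1 / bp_test.high_tm1 should be collapsed in the same way without changing the code.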

data

df1 <- structure(list(race = c("white", "white", "white", "black", "white", 
"black"), gender = c(0L, 0L, 1L, 1L, 0L, 0L), age.low_tm1 = c(1L, 
1L, 1L, 0L, 0L, 0L), age.medium_tm1 = c(0L, 0L, 0L, 1L, 0L, 1L
), age.high_tm1 = c(0L, 0L, 0L, 0L, 1L, 0L), chol_test.low_tm1 = c(0L, 
0L, 0L, 0L, 0L, 1L), chol_test.high_tm1 = c(0L, 0L, 0L, 0L, 1L, 
0L)), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5", "6"))
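
For readers who prefer to avoid extra packages, the same idea can be sketched in base R. This is only an illustration, not the answer's method: collapse_dummies() is a hypothetical helper, and it assumes every dummy column is named `<prefix>.<level><suffix>` with levels low, medium, or high, and that at most one level is flagged per row (if several are, the first one wins).

```r
# Hypothetical helper (not from the original answer): collapse every group of
# "<prefix>.<level><suffix>" dummy columns into one integer column per prefix.
collapse_dummies <- function(df, suffix = "_tm1",
                             codes = c(low = 1L, medium = 2L, high = 3L)) {
  pattern <- paste0("^(.*)\\.(", paste(names(codes), collapse = "|"), ")",
                    suffix, "$")
  dummy_cols <- grep(pattern, names(df), value = TRUE)
  prefix <- sub(pattern, "\\1", dummy_cols)  # e.g. "age", "chol_test"
  level <- sub(pattern, "\\2", dummy_cols)   # e.g. "low", "high"

  # start from the columns that are not dummies (race, gender, ...)
  out <- df[, setdiff(names(df), dummy_cols), drop = FALSE]
  for (p in unique(prefix)) {
    m <- as.matrix(df[, dummy_cols[prefix == p], drop = FALSE])
    level_code <- unname(codes[level[prefix == p]])
    # position of the first flagged column in each row
    flagged <- max.col(m, ties.method = "first")
    # rows with no flag at all (e.g. test never taken) become NA
    has_flag <- unname(rowSums(m) > 0)
    out[[p]] <- ifelse(has_flag, level_code[flagged], NA_integer_)
  }
  out
}

# Same data as df1 above, repeated so the block runs on its own.
df1 <- data.frame(
  race = c("white", "white", "white", "black", "white", "black"),
  gender = c(0L, 0L, 1L, 1L, 0L, 0L),
  age.low_tm1 = c(1L, 1L, 1L, 0L, 0L, 0L),
  age.medium_tm1 = c(0L, 0L, 0L, 1L, 0L, 1L),
  age.high_tm1 = c(0L, 0L, 0L, 0L, 1L, 0L),
  chol_test.low_tm1 = c(0L, 0L, 0L, 0L, 0L, 1L),
  chol_test.high_tm1 = c(0L, 0L, 0L, 0L, 1L, 0L)
)

result <- collapse_dummies(df1)
print(result)
```

The prefixes are discovered from the column names, so adding another dummy group in the same naming scheme should need no code change; a different suffix or level set can be passed through the suffix and codes arguments.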

Votes 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70569460
