
How to conditionally code a new variable by checking multiple values across multiple columns

Stack Overflow user
Asked on 2019-08-24 12:11:46
1 answer, 47 views, 0 followers, 0 votes

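I want to search a set of variables (DX1-DX5) for a range of values (a list of ICD9 codes or a list of ICD10 codes), with the condition that the ICD9 list is searched when DXVER=9 and the ICD10 list is searched when DXVER=10 (coded as "0" in the sample data). This can be written out by hand, but because the variable list and the code lists may grow longer, I would like to learn how to do it by referring to the groups by name.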

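I have tried if-else statements, rowSums, and coding each step separately. These work, but I always end up missing one of the three requirements: 1. conditioning on DXVER; 2. referring to the DX1-DX5 columns through DXS; 3. referring to the code lists through ICD9 and ICD10.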

# ICD code lists that identify an HIV claim
ICD9 <- c('042', '7953', '79571', 'V08')
ICD10 <- c('888', 'a10', 'b10', '987')

# Names of the diagnosis columns: "DX1" ... "DX5"
DXS <- paste0('DX', 1:5)

MergedData<-data.frame(DXVER=c("9", "9", "0", "0", "9"),
                   DX1 = c('042',"1","1","2","1"),
                   DX2 = c("4","3",'2',"4","3"),
                   DX3 = c("2","2","4","2","2"),
                   DX4 = c("3",'101',"2","3","2"),
                   DX5 = c('79571',"3",'888',"2","2"),
                   ID  = c(345, 123, 246, 432, 121))

# Attempt 1: one assignment per diagnosis column and per code list
MergedData$HIV_claim <- NA
MergedData$HIV_claim[MergedData$DX1 %in% ICD9 & MergedData$DXVER == "9"] <- 1
MergedData$HIV_claim[MergedData$DX2 %in% ICD9 & MergedData$DXVER == "9"] <- 1
MergedData$HIV_claim[MergedData$DX3 %in% ICD9 & MergedData$DXVER == "9"] <- 1
MergedData$HIV_claim[MergedData$DX4 %in% ICD9 & MergedData$DXVER == "9"] <- 1
MergedData$HIV_claim[MergedData$DX5 %in% ICD9 & MergedData$DXVER == "9"] <- 1

MergedData$HIV_claim[MergedData$DX1 %in% ICD10 & MergedData$DXVER == "0"] <- 1
MergedData$HIV_claim[MergedData$DX2 %in% ICD10 & MergedData$DXVER == "0"] <- 1
MergedData$HIV_claim[MergedData$DX3 %in% ICD10 & MergedData$DXVER == "0"] <- 1
MergedData$HIV_claim[MergedData$DX4 %in% ICD10 & MergedData$DXVER == "0"] <- 1
MergedData$HIV_claim[MergedData$DX5 %in% ICD10 & MergedData$DXVER == "0"] <- 1


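# Attempt 2: rowSums() per code list, then a nested ifelse()
MergedData2 <- MergedData

MergedData2$HIV_claim9 <- NA
MergedData2$HIV_claim10 <- NA
# Caution: `==` compares element-wise (recycling the code list);
# it is not a membership test like %in%
MergedData2$HIV_claim9  <- (rowSums(MergedData2[, DXS] == ICD9,  na.rm = TRUE) > 0) * 1
MergedData2$HIV_claim10 <- (rowSums(MergedData2[, DXS] == ICD10, na.rm = TRUE) > 0) * 1

# Conditions on DXVER, but covers only the ICD9 list
MergedData2$HIV_claim <- ifelse(MergedData2$DXVER == 9,
  ifelse(MergedData2$DX1 %in% ICD9 | MergedData2$DX2 %in% ICD9 |
         MergedData2$DX3 %in% ICD9 | MergedData2$DX4 %in% ICD9 |
         MergedData2$DX5 %in% ICD9, 1, 0), 0)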
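One likely reason the `rowSums` attempt does not behave as intended is the use of `==` against a whole code list. The following is a minimal sketch of the difference between `==` and `%in%` on plain vectors; the vector `dx` and the names `eq_hits` and `in_hits` are made up for illustration and are not part of the original question.

```r
ICD9 <- c("042", "7953", "79571", "V08")
dx   <- c("V08", "1", "042", "2", "3")   # hypothetical diagnosis column

# `==` compares position by position, recycling the shorter vector.
# R warns here because the lengths (5 and 4) do not divide evenly.
eq_hits <- suppressWarnings(dx == ICD9)

# `%in%` asks, for each element of dx, whether it occurs anywhere in ICD9.
in_hits <- dx %in% ICD9

print(eq_hits)
print(in_hits)
```

In this sketch `eq_hits` is FALSE everywhere, because no code happens to sit at the same position in both vectors, while `in_hits` flags the first and third elements.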

1 Answer

Stack Overflow user

Accepted answer

Posted on 2019-08-24 12:54:24

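library(dplyr)
library(tidyr)  # gather() comes from tidyr, not dplyr

# Assumption: md is the question's data frame with only DXVER, DX1-DX5 and ID
md <- MergedData[c("DXVER", DXS, "ID")]

# Reshape to long form, flag matching codes, keep one flag per ID, join back
md %>% gather(dx, v, -DXVER, -ID) %>%
  mutate(hiv = ifelse((v %in% ICD9 & DXVER == '9') | (v %in% ICD10 & DXVER == '0'), 1, NA)) %>%
  filter(hiv == 1) %>% group_by(ID) %>% slice(1) %>%
  select(ID, hiv) %>% left_join(md, .)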

Joining, by = "ID"
  DXVER DX1 DX2 DX3 DX4   DX5  ID hiv
1     9 042   4   2   3 79571 345   1
2     9   1   3   2 101     3 123  NA
3     0   1   2   4   2   888 246   1
4     0   2   4   2   3     2 432  NA
5     9   1   3   2   2     2 121  NA
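For comparison, here is a base R sketch that meets the three requirements from the question without reshaping: it conditions on DXVER, refers to the diagnosis columns through `DXS`, and refers to the code lists through `ICD9` and `ICD10`. The helper `any_code()` and the intermediate names `hit9` and `hit10` are hypothetical and only illustrate one way to do this. As in the sample data, it assumes DXVER is "0" for ICD10 rows.

```r
# Code lists, column names and sample data from the question
ICD9  <- c("042", "7953", "79571", "V08")
ICD10 <- c("888", "a10", "b10", "987")
DXS   <- paste0("DX", 1:5)

MergedData <- data.frame(
  DXVER = c("9", "9", "0", "0", "9"),
  DX1 = c("042", "1", "1", "2", "1"),
  DX2 = c("4", "3", "2", "4", "3"),
  DX3 = c("2", "2", "4", "2", "2"),
  DX4 = c("3", "101", "2", "3", "2"),
  DX5 = c("79571", "3", "888", "2", "2"),
  ID  = c(345, 123, 246, 432, 121),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each row where any column named in `cols`
# holds a value found in `codes`
any_code <- function(data, cols, codes) {
  per_column <- lapply(data[cols], function(x) x %in% codes)
  Reduce(`|`, per_column)  # combine the per-column logical vectors row-wise
}

# Pair each code list with the DXVER value it applies to
hit9  <- MergedData$DXVER == "9" & any_code(MergedData, DXS, ICD9)
hit10 <- MergedData$DXVER == "0" & any_code(MergedData, DXS, ICD10)

MergedData$HIV_claim <- ifelse(hit9 | hit10, 1, NA)
print(MergedData)
```

Adding more diagnosis columns or codes then only means extending `DXS`, `ICD9`, or `ICD10`; the matching logic stays the same.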
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/57635022
