I want to replace the ICD-10 codes in each row with disease names, using a provided ICD-10 code dictionary.
Here is the initial data:
id <- c("1","2","3")
Dx1 <- c("E119", "I251","I20")
Dx2 <- c("I20", "I251","E119")
Dx3 <- c("I251", "E119","I20")
df <- data.frame(id,Dx1,Dx2,Dx3)
df
Here is the ICD-10 code dictionary. This example has 3 codes, but the real ICD-10 code set contains about 94,000 codes.
ICD <- c("I251", "E119","I20")
Disease <- c("Acute Myocard Infarct", "Type 2 Diabetes", "Chest Pain")
CodeDictionary <- data.frame(ICD,Disease)
CodeDictionary
Here is my goal:
id <- c("1","2","3")
Dx1 <- c("Type 2 Diabetes", "Acute Myocard Infarct","Chest Pain")
Dx2 <- c("Chest Pain", "Acute Myocard Infarct","Type 2 Diabetes")
Dx3 <- c("Acute Myocard Infarct", "Type 2 Diabetes","I20")
dfGoal <- data.frame(id,Dx1,Dx2,Dx3)
dfGoal
I tried an inner join from dplyr, but it did not work. Thanks for your help!
Answered 2020-06-30 00:58:20
You can do this with the mapvalues(...) function from the plyr package. However, you have to loop over the columns, which is not ideal.
library(plyr)
id <- c("1","2","3")
Dx1 <- c("E119", "I251","I20")
Dx2 <- c("I20", "I251","E119")
Dx3 <- c("I251", "E119","I20")
df <- data.frame(id,Dx1,Dx2,Dx3)
dfGoal <- df
# Recode every diagnosis column (all columns except the first, id)
for(i in 2:ncol(df)){
  dfGoal[,i] <- mapvalues(df[,i],
                          from = c("I251", "E119","I20"),
                          to = c("Acute Myocard Infarct", "Type 2 Diabetes", "Chest Pain"))
}
dfGoal
You may find more useful techniques in this thread: https://stackoverflow.com/a/25790005/7120715
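As a possible alternative to the explicit loop, here is a minimal base R sketch that uses a named vector as the lookup table and applies it to all diagnosis columns with lapply. It is not part of the original answer. The names lookup, dx_cols and dfLookup are made up for illustration, and keeping unmatched codes unchanged is an assumption about the desired behavior.

```r
# Rebuild the example data from the question. stringsAsFactors = FALSE keeps
# the codes as character vectors on R versions before 4.0 as well.
df <- data.frame(
  id  = c("1", "2", "3"),
  Dx1 = c("E119", "I251", "I20"),
  Dx2 = c("I20", "I251", "E119"),
  Dx3 = c("I251", "E119", "I20"),
  stringsAsFactors = FALSE
)
CodeDictionary <- data.frame(
  ICD     = c("I251", "E119", "I20"),
  Disease = c("Acute Myocard Infarct", "Type 2 Diabetes", "Chest Pain"),
  stringsAsFactors = FALSE
)

# Named vector: names are ICD codes, values are disease names.
lookup <- setNames(CodeDictionary$Disease, CodeDictionary$ICD)

# Hypothetical helper: every column except id is treated as a diagnosis column.
dx_cols <- setdiff(names(df), "id")

dfLookup <- df
dfLookup[dx_cols] <- lapply(df[dx_cols], function(codes) {
  translated <- unname(lookup[codes])
  # Codes missing from the dictionary come back as NA; keep the original code
  # in that case (an assumption, adjust if NA is preferred).
  ifelse(is.na(translated), codes, translated)
})
dfLookup
```

Because the lookup is built directly from CodeDictionary, the same code should work unchanged with the full dictionary, as long as each ICD code appears only once in it.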
Answered 2020-06-30 00:33:20
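I think inner_join is the right direction. Before joining, you need to reshape the data with pivot_longer so that all codes sit in a single column; once you have the disease names, you can use pivot_wider to get back to the shape of your dfGoal: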
library(tidyverse)
id <- c("1","2","3")
Dx1 <- c("E119", "I251","I20")
Dx2 <- c("I20", "I251","E119")
Dx3 <- c("I251", "E119","I20")
df <- data.frame(id,Dx1,Dx2,Dx3)
df
ICD <- c("I251", "E119","I20")
Disease <- c("Acute Myocard Infarct", "Type 2 Diabetes", "Chest Pain")
CodeDictionary <- data.frame(ICD,Disease)
CodeDictionary
id <- c("1","2","3")
Dx1 <- c("Type 2 Diabetes", "Acute Myocard Infarct","Chest Pain")
Dx2 <- c("Chest Pain", "Acute Myocard Infarct","Type 2 Diabetes")
Dx3 <- c("Acute Myocard Infarct", "Type 2 Diabetes","I20")
dfGoal <- data.frame(id,Dx1,Dx2,Dx3)
dfGoal
df_goal <- df %>%
  pivot_longer(-id, names_to = "diagnoses", values_to = "ICD") %>%
  inner_join(CodeDictionary, by = "ICD") %>%
  select(id, diagnoses, Disease) %>%
  pivot_wider(names_from = diagnoses, values_from = Disease)
Hope this helps!
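One thing to keep in mind with the pipeline above: inner_join drops rows whose code is not in the dictionary, so those cells end up as NA after pivot_wider. If you would rather keep the original code in that case, a variant with left_join and coalesce might look like the sketch below. This is an assumption about what you want, not part of the original answer. The code "XXXX" is a made-up placeholder, not a real ICD-10 code, and dfJoined is a name chosen for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id  = c("1", "2", "3"),
  Dx1 = c("E119", "I251", "I20"),
  Dx2 = c("I20", "I251", "E119"),
  Dx3 = c("I251", "E119", "XXXX"),  # "XXXX": placeholder code absent from the dictionary
  stringsAsFactors = FALSE
)
CodeDictionary <- data.frame(
  ICD     = c("I251", "E119", "I20"),
  Disease = c("Acute Myocard Infarct", "Type 2 Diabetes", "Chest Pain"),
  stringsAsFactors = FALSE
)

dfJoined <- df %>%
  # Long format: one row per (id, diagnosis column, code)
  pivot_longer(-id, names_to = "diagnoses", values_to = "ICD") %>%
  # left_join keeps rows whose code has no match; Disease is NA for them
  left_join(CodeDictionary, by = "ICD") %>%
  # Fall back to the original code when no disease name was found
  mutate(Disease = coalesce(Disease, ICD)) %>%
  select(id, diagnoses, Disease) %>%
  # Back to one row per id with Dx1, Dx2, Dx3 columns
  pivot_wider(names_from = diagnoses, values_from = Disease)
dfJoined
```

The columns are created with stringsAsFactors = FALSE so that the join key and the fallback value are both plain character vectors.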
https://stackoverflow.com/questions/62642081