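I've just started learning R, which is proving really useful, and I'm trying to use it to calculate proportion of days covered (PDC). This metric is used to measure a person's adherence to a medication. Basically, for a given time period, you find all the fills of a drug and use the fill date and days supply to determine which days they cover. For example, if a person got a fill for a 35-day supply on 2/1/2016, their coverage runs from 2/1/2016 to 3/6/2016. Pretty simple.
It gets tricky when they come back for a refill before the coverage from the first fill has run out, because you must not double count days (e.g. if the person has a second fill on 3/1/2016, the days 3/1-3/6 are only counted once).
I've actually written some code that appears to work correctly, but it uses FOR loops, which I've learned don't perform well in R, and I'm worried about what will happen when I start throwing a lot of data at it.
Here's the first part of the code, which builds the test data and initializes some variables: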
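As a small illustration of the counting rule described above (a sketch in base R, not part of the original code; the names `fill1`, `fill2` and `covered` are made up for this example), two overlapping 35-day fills can be expanded into dates and de-duplicated:

```r
# First fill: 35 days of supply starting 2016-02-01 (the fill date counts as day 1)
fill1 <- seq(as.Date("2016-02-01"), by = "day", length.out = 35)
range(fill1)  # "2016-02-01" "2016-03-06"

# Refill on 2016-03-01, before the first fill has run out
fill2 <- seq(as.Date("2016-03-01"), by = "day", length.out = 35)

# Combine both fills and drop the six overlapping days (3/1-3/6)
covered <- unique(c(fill1, fill2))
length(covered)  # 64, i.e. 35 + 35 - 6
```

The overlap is handled purely by `unique()`, which is the same idea the loop-based code below relies on.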
#Create test data vectors
Person <- c(rep("Person1",12),rep("Person2",9))
FillDate <- c("2016-1-1", "2016-2-1", "2016-3-1", "2016-4-1", "2016-5-1", "2016-6-1", "2016-7-1", "2016-8-1", "2016-9-1", "2016-10-1", "2016-11-1", "2016-12-1", "2016-2-1", "2016-3-1", "2016-4-20", "2016-5-1", "2016-6-1", "2016-7-1", "2016-8-1", "2016-9-1", "2016-10-1")
DaysSupply <- c(rep("35", 14), "20", "5", "20", rep("35", 4))
#Build into data.frame
PDCTestData <- cbind.data.frame(as.factor(Person),as.Date(FillDate,"%Y-%m-%d"),as.numeric(DaysSupply))
colnames(PDCTestData) <- c("Person","FillDate","DaysSupply")
#Create start and end dates for overall period
StartDate <- as.Date("2016-01-01")
EndDate <- as.Date("2016-12-31")
#Initialize DaysCoveredList, a vector to hold the list of dates that a person has drug coverage
DaysCoveredList <- NULL
#Initialize DaysCoveredTable, a matrix to count the total number of unique days in the DaysCovered List, by person
DaysCoveredTable <- NULL
The second part does the actual work:
#Begin looping through individuals
for(p in levels(PDCTestData$Person)){
  #Begin looping through drug fills
  for(DrugSpan in 1:nrow(PDCTestData[PDCTestData$Person == p,])){
    #Create a sequence of the dates covered by that fill: the sequence starts on the fill date and runs for the number of days in DaysSupply. Builds a list of all days covered for that person
    DaysCoveredList <- c(DaysCoveredList,seq.Date(from = PDCTestData[PDCTestData$Person == p,][DrugSpan,]$FillDate, length.out = PDCTestData[PDCTestData$Person == p,][DrugSpan,]$DaysSupply, by = "day"))
  } #Exit drug fill loop
  #Counts the number of unique days covered from the DaysCoveredList, within the start and end of the overall period
  DaysCovered <- length(unique(DaysCoveredList[DaysCoveredList >= StartDate & DaysCoveredList <= EndDate]))
  #Adds the unique count from DaysCovered to the summary DaysCoveredTable
  DaysCoveredTable <- rbind(DaysCoveredTable,cbind(p,DaysCovered))
  #Clear DaysCovered and DaysCoveredList
  DaysCovered <- NULL
  DaysCoveredList <- NULL
} #Exit the individual loop
Any help you can offer would be greatly appreciated.
Thanks.
Posted on 2017-03-22 07:35:37
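library(lubridate)
ptd <- PDCTestData # I get bored writing long variable names
# Last covered day of each fill; subtract 1 because the fill date itself counts as day 1
ptd$EndDate <- ptd$FillDate + ptd$DaysSupply - 1
ptd$DrugInterval <- interval(ptd$FillDate, ptd$EndDate)
# Every day in the overall period
all_days <- seq(StartDate, EndDate, by = "day")
# For each person, count the days that fall within at least one of their fill intervals
lapply(unique(ptd$Person), function (y) sum(sapply(all_days, function (x) any(x %within% ptd$DrugInterval[ptd$Person==y]))))
No guarantees on speed, but it may be easier to read.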
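For comparison, here is a minimal base R sketch of the same calculation that needs no extra packages and also divides by the length of the period to give the proportion itself. It is only one possible approach, not the method from the question or the answer above. The helper `days_covered()` and the names `fills`, `period_start` and `period_end` are hypothetical, and the test data is rebuilt so the block runs on its own.

```r
# Rebuild the test data from the question
fills <- data.frame(
  Person = c(rep("Person1", 12), rep("Person2", 9)),
  FillDate = c(
    seq(as.Date("2016-01-01"), by = "month", length.out = 12),
    as.Date(c("2016-02-01", "2016-03-01", "2016-04-20", "2016-05-01", "2016-06-01",
              "2016-07-01", "2016-08-01", "2016-09-01", "2016-10-01"))
  ),
  DaysSupply = c(rep(35, 14), 20, 5, 20, rep(35, 4))
)
period_start <- as.Date("2016-01-01")
period_end <- as.Date("2016-12-31")

# Hypothetical helper: number of unique covered days within [start, end]
days_covered <- function(fill_date, days_supply, start, end) {
  # Work with day numbers so that unlist() does not have to preserve the Date class
  fill_num <- as.numeric(fill_date)
  # Expand each fill into the day numbers it covers (the fill date is day 1)
  days <- unlist(Map(function(d, n) d + seq_len(n) - 1, fill_num, days_supply))
  # Drop double-counted days, then keep only days inside the overall period
  days <- unique(days)
  sum(days >= as.numeric(start) & days <= as.numeric(end))
}

# Apply the helper to each person's fills
covered <- sapply(split(fills, fills$Person), function(df) {
  days_covered(df$FillDate, df$DaysSupply, period_start, period_end)
})
covered  # Person1: 366, Person2: 231

# Proportion of days covered over the 366-day period
pdc <- covered / as.numeric(period_end - period_start + 1)
round(pdc, 3)  # Person1: 1.000, Person2: 0.631
```

This still loops implicitly through `Map()` and `sapply()`, but it avoids growing a vector inside nested `for` loops and never subsets the full data frame once per fill. How it scales on a large data set has not been measured here.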
https://stackoverflow.com/questions/42937624