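Hi, I have a data frame. How can I replace the NA values in Val_1 based on the nearest value in Val_2?
For ID 4, Val_1 is NA and the corresponding Val_2 is 33.3. It should be replaced using the nearest earlier value in Val_2, which gives 45 (the nearest earlier Val_2 is 33, whose Val_1 is 45). For ID 8 the replacement is 33 (the nearest value to 44.6 is 44.5).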
ID Date Val_1 Val_2
1 01-02-2014 NA 22
2 02-02-2014 23 NA
3 03-02-2014 45 33
4 04-02-2014 NA 33.3
5 05-02-2014 45 46
6 06-02-2014 33 44.5
7 07-02-2014 56 48
8 08-02-2014 NA 44.6
9 09-02-2014 10 43
10 10-02-2014 14 56
11 11-02-2014 NA NA
12 12-02-2014 22 22
We can replace NA values with:
library(zoo)
na.locf(na.locf(DF$Val_1), fromLast = TRUE)
But the code above fills each NA with the previous value from the same column, not with the value that matches the nearest Val_2.
Expected output:
ID Date Val_1 Val_2
1 01-02-2014 NA 22
2 02-02-2014 23 NA
3 03-02-2014 45 33
4 04-02-2014 45 33.3
5 05-02-2014 45 46
6 06-02-2014 33 44.5
7 07-02-2014 56 48
8 08-02-2014 33 44.6
9 09-02-2014 10 43
10 10-02-2014 14 56
11 11-02-2014 NA NA
12 12-02-2014 22 22
Thanks
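To make the rule in the question concrete, here is a minimal sketch in base R. It rebuilds the example table with read.table (the column types are an assumption) and uses a hypothetical helper, val2_distances, that is not part of the original post. For a given ID it lists how far each earlier complete row is from that row's Val_2.

```r
# Rebuild the example table from the question (column types are an assumption).
DF <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID Date Val_1 Val_2
1 01-02-2014 NA 22
2 02-02-2014 23 NA
3 03-02-2014 45 33
4 04-02-2014 NA 33.3
5 05-02-2014 45 46
6 06-02-2014 33 44.5
7 07-02-2014 56 48
8 08-02-2014 NA 44.6
9 09-02-2014 10 43
10 10-02-2014 14 56
11 11-02-2014 NA NA
12 12-02-2014 22 22
")

# Hypothetical helper: distance in Val_2 from row `id` to every earlier
# row that has both Val_1 and Val_2 present.
val2_distances <- function(df, id) {
  target  <- df$Val_2[df$ID == id]
  earlier <- df[df$ID < id & !is.na(df$Val_1) & !is.na(df$Val_2), ]
  data.frame(ID    = earlier$ID,
             Val_1 = earlier$Val_1,
             dist  = abs(earlier$Val_2 - target))
}

val2_distances(DF, 4)  # only ID 3 qualifies, so its Val_1 (45) is used
val2_distances(DF, 8)  # ID 6 (Val_2 = 44.5) is the closest, so Val_1 = 33
```

The row with the smallest dist is the one whose Val_1 should be copied, which matches the expected output above for IDs 4 and 8.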
Posted on 2017-07-11 09:46:38
Sorry, I couldn't think of a simpler way:
# dplyr provides the pipe (%>%) and the data-manipulation verbs
library(dplyr)

# Threshold: a nearest value only counts if its Val_2 differs by at most this much
diff.threshold <- 0.5

# IDs whose Val_1 must be filled in (Val_1 missing, Val_2 present)
IDtoReplace <- DF %>%
  filter(is.na(Val_1), !is.na(Val_2)) %>%
  pull(ID)

for (id in IDtoReplace) {
  # Val_2 of the current id
  curVal2 <- DF %>% filter(ID == id) %>% pull(Val_2)

  # Value to be filled in
  valuetoinput <- DF %>%
    filter(!is.na(Val_1), !is.na(Val_2), ID < id) %>%  # complete rows with an earlier ID only
    mutate(diff = abs(Val_2 - curVal2)) %>%            # distance in Val_2 to the current row
    filter(diff <= diff.threshold) %>%                 # must be within the threshold
    arrange(diff) %>%                                  # smallest difference first
    slice(1) %>%                                       # keep one row even if there are ties
    pull(Val_1)

  # If a value was found, write it into the data frame
  if (length(valuetoinput) > 0) {
    DF[DF$ID == id, "Val_1"] <- valuetoinput
  }
}
Result:
> DF
ID Date Val_1 Val_2
1 1 01-02-2014 NA 22.0
2 2 02-02-2014 23 NA
3 3 03-02-2014 45 33.0
4 4 04-02-2014 45 33.3
5 5 05-02-2014 45 46.0
6 6 06-02-2014 33 44.5
7 7 07-02-2014 56 48.0
8 8 08-02-2014 33 44.6
9 9 09-02-2014 10 43.0
10 10 10-02-2014 14 56.0
11 11 11-02-2014 NA NA
12 12 12-02-2014 22 22.0
Will you be using something like this often? If so, I suggest rewriting the for loop as a function.
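As a rough idea of what such a function could look like, here is a minimal sketch in base R. The name fill_from_nearest and its threshold argument are hypothetical, and the data frame is rebuilt by hand from the table in the question. Like the loop above, it lets a row that was already filled act as a donor for later rows, and it takes the earliest row when several are equally close.

```r
# Example data rebuilt from the question (values typed in by hand).
DF <- data.frame(
  ID    = 1:12,
  Date  = sprintf("%02d-02-2014", 1:12),
  Val_1 = c(NA, 23, 45, NA, 45, 33, 56, NA, 10, 14, NA, 22),
  Val_2 = c(22, NA, 33, 33.3, 46, 44.5, 48, 44.6, 43, 56, NA, 22)
)

# Hypothetical function: fill missing Val_1 from the earlier row whose
# Val_2 is closest, provided the difference is within `threshold`.
fill_from_nearest <- function(df, threshold = 0.5) {
  out <- df
  # Rows where Val_1 is missing but Val_2 is available
  targets <- which(is.na(df$Val_1) & !is.na(df$Val_2))
  for (i in targets) {
    # Candidate donors: earlier IDs with both values present
    # (rows filled in earlier iterations count as well)
    donors <- which(out$ID < out$ID[i] &
                    !is.na(out$Val_1) & !is.na(out$Val_2))
    if (length(donors) == 0) next
    d    <- abs(out$Val_2[donors] - out$Val_2[i])
    best <- which.min(d)  # first minimum, so ties go to the earliest donor
    if (d[best] <= threshold) {
      out$Val_1[i] <- out$Val_1[donors[best]]
    }
  }
  out
}

result <- fill_from_nearest(DF, threshold = 0.5)
result$Val_1
```

The function returns a modified copy instead of changing DF in place, so the original data stays available for comparison or for trying a different threshold.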
https://stackoverflow.com/questions/45025537