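Hey, I need to fill in the missing values of a data frame before I can use it in a Shiny app. The rule is to fill a missing value in column K with the value from column K-1 of the same row, across the whole data frame.
I have actually worked out a way to do this, but I think my approach is too complicated and I believe there should be a simpler one. I have attached the data, my code, and the output below. If you know a simpler way, please let me know.
Thanks a lot.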
data2 = data.frame('population by age' = seq(3, 24, by = 1),
'2008' = c(145391,
140621,
136150,
131944,
127968,
124209,
120650,
117163,
113674,
110207,
106871,
103659,
100398,
97017,
93584,
90240,
86957,
83783,
80756,
77850,
75003,
72226
),
'2009' = c(148566,
143943,
139367,
135083,
131052,
NA,
123628,
120213,
116826,
113381,
109915,
106574,
103346,
100058,
96644,
93175,
NA,
86455,
NA,
80192,
77279,
74422
),
'2010' = c(152330,
147261,
142555,
138172,
134071,
130214,
126559,
123099,
119825,
116538,
113134,
109669,
106320,
103075,
99760,
96312,
92805,
NA,
NA,
82733,
79661,
76739
),
'2011' = c(156630,
151387,
146491,
141905,
137593,
133545,
129737,
126124,
122678,
NA,
116093,
112666,
109174,
105791,
102505,
99159,
95699,
92193,
88759,
85373,
82123,
79065
))
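library(dplyr)    # %>%, group_by, mutate, select
library(tidyr)    # gather, nest, unnest, spread
library(imputeTS) # na.locf
# Reshape to long format and nest the yearly values for each age
data7 <- data2 %>%
  gather(key = year, value = value, -`population.by.age`) %>%
  group_by(`population.by.age`) %>%
  nest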
impute_nas <- function(df, var, fun, ...) {
  df[[var]] <- fun(df[[var]], ...)
  return(df)
}
imputed <- data7 %>%
  mutate(
    interpolation = purrr::map(data, impute_nas, var = 'value', fun = imputeTS::na.locf)
  ) %>%
  select(-data) %>%
  unnest
imputed <- imputed %>% spread(key = 'year', value = 'value')
as.data.frame(imputed)
Posted on 2018-06-20 02:16:28
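I can think of a quick solution using a for loop, shown below. Of course, the first column cannot be imputed, because it has no previous column to take values from.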
impute_from_previous <- function(ds) {
  for (i in 2:length(colnames(ds))) {
    rows_missing <- which(is.na(ds[[i]]))
    ds[rows_missing, i] <- ds[rows_missing, i - 1]
  }
  return(ds)
}
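One caveat about the loop above: it starts at column 2 and reads from column 1, so a gap in the first year column would be filled with the age value. The question's data has no gaps in 2008, so this does not happen there. The sketch below is one possible variant, not part of the original answer; the function name `impute_from_previous_cols`, the `value_cols` argument, and the toy data are all made up for illustration.

```r
# Hypothetical variant: `value_cols` (an assumed argument) lists the positions
# of the year columns, so the id column is never used as a fill source.
impute_from_previous_cols <- function(ds, value_cols) {
  # Start at the second value column; the first one has no predecessor.
  for (k in seq_along(value_cols)[-1]) {
    cur  <- value_cols[k]
    prev <- value_cols[k - 1]
    rows_missing <- which(is.na(ds[[cur]]))
    ds[rows_missing, cur] <- ds[rows_missing, prev]
  }
  ds
}

# Made-up toy data (not the question's data).
toy <- data.frame(age = c(3, 4, 5),
                  y1  = c(10, NA, 30),
                  y2  = c(NA, NA, 31),
                  y3  = c(12, 22, NA))
filled <- impute_from_previous_cols(toy, value_cols = 2:4)
```

With this variant, a value missing in the first year column stays `NA` (and any `NA` directly to its right stays `NA` too) instead of silently picking up the age.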
data3 <- impute_from_previous(data2)
Posted on 2018-06-20 02:22:18
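One option is to use zoo::na.locf, which fills each NA with the last available (non-missing) value. apply can pass the data to it row by row, and zoo::na.locf then fills the missing values within each row.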
library(zoo)
cbind(data2[1], t(apply(data2[2:5], 1, zoo::na.locf)))
# population.by.age X2008 X2009 X2010 X2011
# 1 3 145391 148566 152330 156630
# 2 4 140621 143943 147261 151387
# 3 5 136150 139367 142555 146491
# 4 6 131944 135083 138172 141905
# 5 7 127968 131052 134071 137593
# 6 8 124209 124209 130214 133545
# 7 9 120650 123628 126559 129737
# 8 10 117163 120213 123099 126124
# 9 11 113674 116826 119825 122678
# 10 12 110207 113381 116538 116538
# 11 13 106871 109915 113134 116093
# 12 14 103659 106574 109669 112666
# 13 15 100398 103346 106320 109174
# 14 16 97017 100058 103075 105791
# 15 17 93584 96644 99760 102505
# 16 18 90240 93175 96312 99159
# 17 19 86957 86957 92805 95699
# 18 20 83783 86455 86455 92193
# 19 21 80756 80756 80756 88759
# 20 22 77850 80192 82733 85373
# 21 23 75003 77279 79661 82123
# 22 24 72226 74422 76739 79065
https://stackoverflow.com/questions/50934400
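A note on the row-wise `zoo::na.locf` call, sketched on made-up data rather than the question's: by default `na.locf` uses `na.rm = TRUE`, which drops leading NAs, so a row whose first year is missing would come back shorter than the others and `apply` could no longer return a matrix. Passing `na.rm = FALSE` is one way to keep every row the same length. The toy data frame below is an assumption for illustration only.

```r
library(zoo)

# Made-up data (not the question's): row 2 is missing its first year.
toy <- data.frame(age   = c(3, 4),
                  X2008 = c(10, NA),
                  X2009 = c(NA, 21),
                  X2010 = c(12, NA))

# na.rm = FALSE keeps leading NAs, so every row keeps its length and
# apply() returns a matrix that t() turns back into one row per age.
filled <- cbind(toy[1], t(apply(toy[2:4], 1, zoo::na.locf, na.rm = FALSE)))
```

Here the leading gap in row 2 stays `NA`, since there is no earlier year to carry forward, while the later gaps are filled from the left.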