Estimating NA observations using linear approximation

Stack Overflow user
Asked 2015-05-11 12:28:42
Answers: 2 · Views: 254 · Followers: 0 · Votes: 3

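I want to fill in the NA observations at the start of an array by extrapolating linearly from the two non-NA observations that follow them, and then do the same for the NA observations at the end of the array, using the two preceding non-NA observations.

A reproducible example of my data: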

M <- matrix(sample(1:9, 10 * 10, TRUE), 10)
M[sample(1:length(M), 0.5 * length(M), FALSE)] <- NA
dimnames(M) <- list(paste0("City", 1:nrow(M)), paste0("Year", 1:ncol(M)))
M

       Year1 Year2 Year3 Year4 Year5 Year6 Year7 Year8 Year9 Year10
City1     NA     4     5    NA     3    NA    NA    NA     5     NA
City2      6    NA     3     3    NA     4     6    NA    NA      7
City3     NA     7    NA     8     8    NA    NA     8    NA      5
City4      3     5     3    NA    NA     3     5     9     8      7
City5      4     6     6    NA    NA     8    NA     7     1     NA
City6     NA    NA    NA    NA     4    NA     8     3     6      7
City7      9     3    NA    NA    NA    NA    NA     4    NA     NA
City8      5     6     9     8     5    NA    NA     1     4     NA
City9     NA    NA     6    NA     3     3     8    NA     7     NA
City10    NA    NA    NA    NA    NA    NA    NA    NA    NA      1

idx=rowSums(!is.na(M))>=2 # Index of rows with 2 or more non-NA to run na.approx

library(zoo)
M[idx,]=t(na.approx(t(M[idx,]),rule=1,method="linear")) # I'm using t as na.approx works on columns

       Year1 Year2 Year3 Year4    Year5 Year6 Year7 Year8 Year9 Year10
City1     NA   4.0     5   4.0 3.000000  3.50   4.0   4.5     5     NA
City2    6.0   5.5     3   3.0 5.500000  4.00   6.0   6.0     6      7
City3    4.5   7.0     3   8.0 8.000000  3.50   5.5   8.0     7      5
City4    3.0   5.0     3   8.0 6.666667  3.00   5.0   9.0     8      7
City5    4.0   6.0     6   8.0 5.333333  8.00   6.5   7.0     1      7
City6    6.5   4.5     7   8.0 4.000000  6.75   8.0   3.0     6      7
City7    9.0   3.0     8   8.0 4.500000  5.50   8.0   4.0     5     NA
City8    5.0   6.0     9   8.0 5.000000  4.25   8.0   1.0     4     NA
City9     NA    NA     6   4.5 3.000000  3.00   8.0   7.5     7     NA
City10    NA    NA    NA    NA       NA    NA    NA    NA    NA      1

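I want to extrapolate the boundary values (for City1 through City9) by linear approximation from the two following or preceding observations. For example, M[1,1] should be 3 and M[1,10] should be 5.5.

Do you know how I can do this?

2 Answers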

Stack Overflow user

Accepted answer

Posted 2015-05-11 13:22:46

This gives the first column with its NAs filled in by linear extrapolation. You can adapt it for the last column.

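# dat is assumed to be the interpolated matrix as a data frame,
# e.g. dat <- as.data.frame(M)
firstNAfill <- function(x) {
  x <- unlist(x)                         # one row as a plain numeric vector
  if (!is.na(x[1])) return(x[[1]])       # first value present: keep it
  idx <- which(!is.na(x))                # positions of the non-NA values
  if (length(idx) < 2) return(NA_real_)  # not enough points to extrapolate
  2 * x[[idx[1]]] - x[[idx[2]]]          # one step back along the first slope
}

dat$Year1 <- sapply(seq_len(nrow(dat)), function(i) firstNAfill(dat[i, ]))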

Result:

       Year1 Year2 Year3 Year4    Year5 Year6 Year7 Year8 Year9 Year10
City1    3.0   4.0     5   4.0 3.000000  3.50   4.0   4.5     5     NA
City2    6.0   5.5     3   3.0 5.500000  4.00   6.0   6.0     6      7
City3    4.5   7.0     3   8.0 8.000000  3.50   5.5   8.0     7      5
City4    3.0   5.0     3   8.0 6.666667  3.00   5.0   9.0     8      7
City5    4.0   6.0     6   8.0 5.333333  8.00   6.5   7.0     1      7
City6    6.5   4.5     7   8.0 4.000000  6.75   8.0   3.0     6      7
City7    9.0   3.0     8   8.0 4.500000  5.50   8.0   4.0     5     NA
City8    5.0   6.0     9   8.0 5.000000  4.25   8.0   1.0     4     NA
City9    7.5    NA     6   4.5 3.000000  3.00   8.0   7.5     7     NA
City10    NA    NA    NA    NA       NA    NA    NA    NA    NA      1

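The function returns the current value of the first column if it is not NA, NA if there are not two values to extrapolate from, and the extrapolated value otherwise.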
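The adaptation for the last column might look like the sketch below. `lastNAfill` is a hypothetical name that does not appear in the answer; it simply mirrors `firstNAfill` by taking the last two non-NA values instead of the first two. Like the original, it steps one position along the slope, so it is only an exact linear extrapolation when the last non-NA value sits right next to the final column.

```r
# Hypothetical mirror of firstNAfill for the last column (not from the answer).
lastNAfill <- function(x) {
  x <- unlist(x)                         # accept a data frame row or a vector
  n <- length(x)
  if (!is.na(x[n])) return(x[[n]])       # last value present: keep it
  idx <- rev(which(!is.na(x)))           # non-NA positions, last one first
  if (length(idx) < 2) return(NA_real_)  # not enough points to extrapolate
  2 * x[[idx[1]]] - x[[idx[2]]]          # one step forward along the last slope
}

# City1 row from the result table above, with Year10 still missing
city1 <- c(3, 4, 5, 4, 3, 3.5, 4, 4.5, 5, NA)
lastNAfill(city1)  # 2 * 5 - 4.5 = 5.5
```

With a data frame `dat` as in the answer, it could be applied row by row in the same way as `firstNAfill`, assigning the result to `dat$Year10`.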

Votes: 1

Stack Overflow user

Posted 2015-05-11 16:50:27

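In extrap, nlead is the number of leading NAs in the input vector x, and non.na is the subset of the elements of x that are not NA. If there are no leading NAs, or if there are fewer than 2 non-NA elements, the input is returned unchanged. m is the slope between the first two non-NA values. The first nlead elements of x are then replaced with the extrapolated values. After filling the interior NAs with na.approx, we apply extrap to each row of MM, assigning through MM[] <- so that the row and column names are retained; then we reverse each row, repeat, and reverse back: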

library(zoo)

extrap <- function(x) {
    nlead <- which.min(x * 0) - 1   # number of leading NAs (integer(0) if x is all NA)
    non.na <- na.omit(x)
    if (length(nlead) == 0 || nlead == 0 || length(non.na) < 2) return(x)
    m <- diff(head(non.na, 2))      # slope between the first two non-NA values
    replace(x, seq_len(nlead), non.na[1] - nlead:1 * m)
}

nc <- ncol(M)

naApprox <- function(x) if (length(na.omit(x)) < 2) x else na.approx(x, na.rm = FALSE)
MM <- M
MM[] <- t(apply(MM, 1, naApprox))

MM[] <- t(apply(MM, 1, extrap)) # extrapolate to fill leading NAs
MM[] <- t(apply(MM[, nc:1], 1, extrap))[, nc:1] # extrapolate to fill trailing NAs

giving:

> MM
       Year1 Year2    Year3    Year4    Year5    Year6    Year7    Year8    Year9    Year10
City1    3.0   4.0 5.000000 4.000000 3.000000 3.500000 4.000000 4.500000 5.000000  5.500000
City2    6.0   4.5 3.000000 3.000000 3.500000 4.000000 6.000000 6.333333 6.666667  7.000000
City3    6.5   7.0 7.500000 8.000000 8.000000 8.000000 8.000000 8.000000 6.500000  5.000000
City4    3.0   5.0 3.000000 3.000000 3.000000 3.000000 5.000000 9.000000 8.000000  7.000000
City5    4.0   6.0 6.000000 6.666667 7.333333 8.000000 7.500000 7.000000 1.000000 -5.000000
City6   -4.0  -2.0 0.000000 2.000000 4.000000 6.000000 8.000000 3.000000 6.000000  7.000000
City7    9.0   3.0 3.166667 3.333333 3.500000 3.666667 3.833333 4.000000 4.166667  4.333333
City8    5.0   6.0 9.000000 8.000000 5.000000 3.666667 2.333333 1.000000 4.000000  7.000000
City9    9.0   7.5 6.000000 4.500000 3.000000 3.000000 8.000000 7.500000 7.000000  6.500000
City10    NA    NA       NA       NA       NA       NA       NA       NA       NA  1.000000
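To see where individual values in this output come from, here is a small trace of extrap on single rows. This is only an illustrative sketch: the vectors `city6` and `city5` are copied by hand from the interpolated rows above (they are not variables from the answer), and the function body is repeated so the snippet runs on its own.

```r
extrap <- function(x) {
    nlead <- which.min(x * 0) - 1   # number of leading NAs
    non.na <- na.omit(x)
    if (length(nlead) == 0 || nlead == 0 || length(non.na) < 2) return(x)
    m <- diff(head(non.na, 2))      # slope between the first two non-NA values
    replace(x, seq_len(nlead), non.na[1] - nlead:1 * m)
}

# City6 after interpolation: four leading NAs, first two values 4 and 6
city6 <- c(NA, NA, NA, NA, 4, 6, 8, 3, 6, 7)
lead6 <- extrap(city6)
lead6[1:4]   # 4 - (4:1) * 2, i.e. -4 -2 0 2

# City5 after interpolation: one trailing NA, last two values 7 and 1.
# Reversing turns the trailing NA into a leading one.
city5 <- c(4, 6, 6, 20/3, 22/3, 8, 7.5, 7, 1, NA)
trail5 <- rev(extrap(rev(city5)))
trail5[10]   # 1 - 1 * (7 - 1), i.e. -5
```

The negative values for City5 and City6 are a direct consequence of continuing a steep slope over one or more steps; linear extrapolation does not keep results within the range of the observed data.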

Note: we used the following as M:

M <- structure(c(NA, 6L, NA, 3L, 4L, NA, 9L, 5L, NA, NA, 4L, NA, 7L, 
5L, 6L, NA, 3L, 6L, NA, NA, 5L, 3L, NA, 3L, 6L, NA, NA, 9L, 6L, 
NA, NA, 3L, 8L, NA, NA, NA, NA, 8L, NA, NA, 3L, NA, 8L, NA, NA, 
4L, NA, 5L, 3L, NA, NA, 4L, NA, 3L, 8L, NA, NA, NA, 3L, NA, NA, 
6L, NA, 5L, NA, 8L, NA, NA, 8L, NA, NA, NA, 8L, 9L, 7L, 3L, 4L, 
1L, NA, NA, 5L, NA, NA, 8L, 1L, 6L, NA, 4L, 7L, NA, NA, 7L, 5L, 
7L, NA, 7L, NA, NA, NA, 1L), .Dim = c(10L, 10L), .Dimnames = list(
    c("City1", "City2", "City3", "City4", "City5", "City6", "City7", 
    "City8", "City9", "City10"), c("Year1", "Year2", "Year3", 
    "Year4", "Year5", "Year6", "Year7", "Year8", "Year9", "Year10"
    )))

Update: Fixes.

Votes: 2

Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/30167674
