Filling NA in time series data with different interpolation techniques
# Build the example data frame (timestamps are read as character first)
df <- data.frame(
  Time = c("7/16/2017 18:46", "7/16/2017 21:52", "7/16/2017 23:16",
           "7/17/2017 4:03", "7/17/2017 5:13", "7/17/2017 5:27",
           "7/17/2017 18:57", "7/17/2017 19:25", "7/17/2017 23:58",
           "7/18/2017 2:59", "7/18/2017 3:27", "7/18/2017 3:59"),
  Flux = c(NA, NA, 4.51263406, NA, NA, 2.291454049, NA, 4.568703192,
           NA, NA, 3.392520428, NA),
  int = c(403.5413091, 421.5796345, NA, 410.0796897, NA, NA,
          363.5271212, NA, NA, 398.9564539, NA, NA),
  corr = c(422.745436, 447.6726631, NA, 420.4392183, NA, NA,
           408.7056493, NA, NA, 421.8799971, NA, NA),
  dat = c(NA, NA, NA, NA, 2.316481462, NA, NA, NA, 7.11779784,
          NA, NA, 2.953349661)
)
# Parse the timestamp strings into POSIXct (the column is named Time)
df$Time <- as.POSIXct(strptime(df$Time, format = "%m/%d/%Y %H:%M"))
It looks like this:
Time              Flux          int           corr          dat
7/16/2017 18:46   NA            403.5413091   422.745436    NA
7/16/2017 21:52   NA            421.5796345   447.6726631   NA
7/16/2017 23:16   4.51263406    NA            NA            NA
7/17/2017 4:03    NA            410.0796897   420.4392183   NA
7/17/2017 5:13    NA            NA            NA            2.316481462
7/17/2017 5:27    2.291454049   NA            NA            NA
7/17/2017 18:57   NA            363.5271212   408.7056493   NA
7/17/2017 19:25   4.568703192   NA            NA            NA
7/17/2017 23:58   NA            NA            NA            7.11779784
7/18/2017 2:59    NA            398.9564539   421.8799971   NA
7/18/2017 3:27    3.392520428   NA            NA            NA
7/18/2017 3:59    NA            NA            NA            2.953349661
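I have five columns (one time column and four continuous data columns). Every data column contains many NA values. I want to interpolate and fill the NAs in all of them. Since I don't know which interpolation method I need, I would like to try several (linear, spline, etc.). I tried na.approx, but it didn't work.
Can anyone help?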
df <- fill(df, direction = c(names(df)))
However, I don't know which technique this uses to fill the NAs.
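If you want to try out and compare several of the interpolation methods you mention, you can use the na.interpolation() function from the imputeTS package. (From imputeTS 3.0 onward the function is named na_interpolation(), and the older name is deprecated.)
For linear interpolation: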
library("imputeTS")
na.interpolation(df, option = "linear")
For spline interpolation:
library("imputeTS")
na.interpolation(df, option = "spline")
For Stineman interpolation:
library("imputeTS")
na.interpolation(df, option = "stine")
As you can see, you only need to change the option parameter.
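To show what linear and spline filling do on a single column, here is a minimal base R sketch that interpolates Flux against the actual timestamps rather than the row position. It uses only stats::approx() and stats::spline(). The helper name fill_by_time, the UTC time zone, and the choice to hold the end values constant for the linear case (rule = 2) are assumptions made for illustration; this is not how imputeTS is implemented.

```r
# Illustrative data: the Time and Flux columns from the question.
time <- as.POSIXct(
  c("7/16/2017 18:46", "7/16/2017 21:52", "7/16/2017 23:16",
    "7/17/2017 4:03", "7/17/2017 5:13", "7/17/2017 5:27",
    "7/17/2017 18:57", "7/17/2017 19:25", "7/17/2017 23:58",
    "7/18/2017 2:59", "7/18/2017 3:27", "7/18/2017 3:59"),
  format = "%m/%d/%Y %H:%M", tz = "UTC"  # time zone is an assumption
)
flux <- c(NA, NA, 4.51263406, NA, NA, 2.291454049, NA, 4.568703192,
          NA, NA, 3.392520428, NA)

# Hypothetical helper: fill the NAs in y by interpolating over the time axis.
fill_by_time <- function(time, y, method = c("linear", "spline")) {
  method <- match.arg(method)
  x  <- as.numeric(time)  # seconds since the epoch
  ok <- !is.na(y)         # positions with observed values
  filled <- if (method == "linear") {
    # rule = 2: outside the observed range, reuse the nearest observed value
    approx(x[ok], y[ok], xout = x, rule = 2)$y
  } else {
    # natural cubic spline through the observed points (extrapolates at the ends)
    spline(x[ok], y[ok], xout = x, method = "natural")$y
  }
  y[!ok] <- filled[!ok]   # observed values are left untouched
  y
}

flux_linear <- fill_by_time(time, flux, "linear")
flux_spline <- fill_by_time(time, flux, "spline")
```

The same helper could be applied to the other numeric columns, for example with lapply() over df[c("Flux", "int", "corr", "dat")]. Comparing flux_linear and flux_spline side by side is one way to judge which method suits the data.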