na.approx Interpolation in R
I am using zoo's na.approx to fill NA values.
library(zoo)
Bus_data<-data.frame(Action = c("Boarding", "Alighting",NA, NA,"Boarding", "Alighting",NA, NA,"Boarding", "Alighting"),
Distance=c(1,1,2,2,3,3,4,4,5,5),
Time = c(1,2,NA,NA,5,6,NA,NA,9,10))
I would like the resulting data.frame to look like this:
Action Distance Time
1 Boarding 1 1
2 Alighting 1 2
3 NA 2 3.5
4 NA 2 3.5
5 Boarding 3 5
6 Alighting 3 6
7 NA 4 7.5
8 NA 4 7.5
9 Boarding 5 9
10 Alighting 5 10
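However, when I use the call below, some of the existing Time values change (the rows marked in the output that follows):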
na.approx(Bus_data$Time,Bus_data$Distance,ties = "ordered" )
1 Boarding 1 2 <-Value Changes
2 Alighting 1 2
3 NA 2 3.5
4 NA 2 3.5
5 Boarding 3 6 <-Value Changes
6 Alighting 3 6
7 NA 4 7.5
8 NA 4 7.5
9 Boarding 5 10 <-Value Changes
10 Alighting 5 10
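Any idea how to get the desired result with na.approx? Note that "Distance" is evenly spaced in this example only for simplicity; the real dataset has varying distances.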
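We can take the na.approx output, use replace to set the positions that were non-NA in the original column to NA, and then coalesce the original column with that result, so that only the gaps are filled: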
library(dplyr)
library(zoo)
coalesce(Bus_data$Time, replace(na.approx(Bus_data$Time,Bus_data$Distance,
ties = "ordered" ),
!is.na(Bus_data$Time), NA))
#[1] 1.0 2.0 3.5 3.5 5.0 6.0 7.5 7.5 9.0 10.0
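The same idea (keep every observed value and use interpolated values only in the gaps) can also be sketched without dplyr or zoo. This is a minimal illustration rather than the answer's own code: it calls base R's approx directly on the observed rows, and the column name Time_filled is a hypothetical name chosen for the example.

```r
# Same data as in the question.
Bus_data <- data.frame(
  Action   = c("Boarding", "Alighting", NA, NA, "Boarding",
               "Alighting", NA, NA, "Boarding", "Alighting"),
  Distance = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  Time     = c(1, 2, NA, NA, 5, 6, NA, NA, 9, 10)
)

# Rows where Time was actually recorded.
observed <- !is.na(Bus_data$Time)

# Interpolate Time against Distance using only the observed rows.
# ties = "ordered" tells approx that x is already sorted and may repeat.
interpolated <- approx(
  x    = Bus_data$Distance[observed],
  y    = Bus_data$Time[observed],
  xout = Bus_data$Distance,
  ties = "ordered"
)$y

# Keep observed values; use interpolated values only where Time was NA.
# Time_filled is a hypothetical column name for this sketch.
Bus_data$Time_filled <- ifelse(observed, Bus_data$Time, interpolated)
Bus_data$Time_filled
#  [1]  1.0  2.0  3.5  3.5  5.0  6.0  7.5  7.5  9.0 10.0
```

Because the interpolation is done against Distance, this variant should also respect unevenly spaced distances, which the question says occur in the real data.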
You can use approx from base R:
Time = c(1,2,NA,NA,5,6,NA,NA,9,10)
approx(Time, method = "constant", n = length(Time), f = .5)$y
Result:
# [1] 1.0 2.0 3.5 3.5 5.0 6.0 7.5 7.5 9.0 10.0
From ?approx:
f: for method = "constant" a number between 0 and 1 inclusive, indicating a compromise between left- and right-continuous step functions. If y0 and y1 are the values to the left and right of the point then the value is y0 if f == 0, y1 if f == 1, and y0*(1-f)+y1*f for intermediate values. In this way the result is right-continuous for f == 0 and left-continuous for f == 1, even for non-finite y values.
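To make the role of f concrete, here is a small sketch that fills the same vector with three different values of f. The helper name fill_constant is hypothetical and exists only for this illustration. For the first gap, y0 = 2 and y1 = 5, so f = 0.5 gives 2 * 0.5 + 5 * 0.5 = 3.5.

```r
Time <- c(1, 2, NA, NA, 5, 6, NA, NA, 9, 10)

# Hypothetical helper: fill the gaps with a step function for a given f.
# With y omitted, approx uses the positions 1..10 as x and Time as y,
# and drops the NA pairs before interpolating.
fill_constant <- function(f) {
  approx(Time, method = "constant", n = length(Time), f = f)$y
}

fill_constant(0)    # gaps take the value on the left
#  [1]  1  2  2  2  5  6  6  6  9 10
fill_constant(1)    # gaps take the value on the right
#  [1]  1  2  5  5  5  6  9  9  9 10
fill_constant(0.5)  # gaps take the midpoint of left and right
#  [1]  1.0  2.0  3.5  3.5  5.0  6.0  7.5  7.5  9.0 10.0
```

With method = "constant" the filled value depends only on the two neighbouring observations and f, not on how far apart they are on the x axis.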
The na.approx call is very similar:
library(zoo)
na.approx(Time, method = "constant", f = .5)
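As a sketch of how this might be written back into the question's data frame: the example below assumes zoo is installed and relies on na.approx forwarding method and f to approx, as in the call above. Passing na.rm = FALSE is an addition for this illustration; it keeps the result the same length as the column even if the real data starts or ends with NA.

```r
library(zoo)

# Same data as in the question.
Bus_data <- data.frame(
  Action   = c("Boarding", "Alighting", NA, NA, "Boarding",
               "Alighting", NA, NA, "Boarding", "Alighting"),
  Distance = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  Time     = c(1, 2, NA, NA, 5, 6, NA, NA, 9, 10)
)

# Fill each gap with the midpoint of its neighbouring observations.
# na.rm = FALSE leaves any leading or trailing NA in place instead of
# trimming it, so the assignment never changes the column length.
Bus_data$Time <- na.approx(Bus_data$Time, method = "constant",
                           f = 0.5, na.rm = FALSE)
Bus_data$Time
#  [1]  1.0  2.0  3.5  3.5  5.0  6.0  7.5  7.5  9.0 10.0
```

Observed values are left untouched here because a step function evaluated at a known point returns that point's own value.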