使用 R 在数据框中插入缺失数据

Interpolating missing data in a dataframe with R

我有一个类似于下面的数据框:

Country Ccode Year Happiness Power   
1  France    FR 2000      1000  1000  
2  France    FR 2001        NA    NA
3  France    FR 2002        NA    NA
4  France    FR 2003      1600  2200
5  France    FR 2004        NA    NA
6      UK    UK 2000      1000  1000  
7      UK    UK 2001        NA    NA
8      UK    UK 2002      1000  1000  
9      UK    UK 2003      1000  1000
10     UK    UK 2004      1000  1000 

我之前使用过以下代码来获取差异:

df <- df %>%
  arrange(country, year) %>%  #sort data
  group_by(country) %>%
  mutate_if(is.numeric, funs(d = . - lag(.)))

我想通过计算 HappinessPower 的数据点之间的差异来扩展此代码,将其除以数据点之间的年份差异并计算要替换的值NA 与,导致以下输出。

Country Ccode Year Happiness Power   
1  France    FR 2000      1000  1000  
2  France    FR 2001      1200  1400    
3  France    FR 2002      1400  1800
4  France    FR 2003      1600  2200
5  France    FR 2004        NA    NA
6      UK    UK 2000      1000  1000  
7      UK    UK 2001        0      0
8      UK    UK 2002      1000  1000  
9      UK    UK 2003      1000  1000
10     UK    UK 2004      1000  1000  

执行此任务的有效方法是什么?

编辑:请注意 France 2004 也是 NA。扩展功能似乎确实可以正确处理这种情况。

编辑 2:添加 group_by(国家/地区)似乎把事情搞砸了 reasons:It 似乎代码正在尝试将 character 转换为 [=19= 】,虽然我不太明白为什么。当我将列转换为 character 时,错误变成了评估错误。有什么建议吗?

> TRcomplete<-TRcomplete%>%
+     group_by(country) %>%
+     mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) : 
  Column `F116.s` can't be converted from character to numeric
> TRcomplete$F116.s <- as.numeric(TRcomplete$F116.s)
> TRcomplete<-TRcomplete%>%
+     group_by(country) %>%
+     mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) : 
  Column `F116.s` can't be converted from character to numeric
> TRcomplete$F116.s <- as.numeric(as.character(TRcomplete$F116.s))
> TRcomplete<-TRcomplete%>%
+     group_by(country) %>%
+     mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) : 
  Column `F116.s` can't be converted from character to numeric
> TRcomplete$F116.s <- as.character(TRcomplete$F116.s))
Error: unexpected ')' in "TRcomplete$F116.s <- as.character(TRcomplete$F116.s))"
> TRcomplete$F116.s <- as.character(TRcomplete$F116.s)
> str(TRcomplete$F116.s)
 chr [1:6984] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ...
> TRcomplete<-TRcomplete%>%
+     group_by(country) %>%
+     mutate_at(70:73,~na.fill(.x,"extend"))
Error in mutate_impl(.data, dots) : 
  Evaluation error: need at least two non-NA values to interpolate.

您可以将 na.fillzoo

中的 fill="extend" 一起使用
rapply(df, zoo::na.fill,"integer",fill="extend",how="replace")
  Country Ccode Year Happiness Power
1  France    FR 2000      1000  1000
2  France    FR 2001      1200  1400
3  France    FR 2003      1400  1800
4  France    FR 2004      1600  2200
5      UK    UK 2000      1000  1000
6      UK    UK 2001      1000  1000
7      UK    UK 2003      1000  1000
8      UK    UK 2004      1000  1000

编辑:

library(tidyverse)
library(zoo)
df%>%
  group_by(Country)%>%
  mutate_at(4:5,~na.fill(.x,"extend"))

  Country Ccode Year Happiness Power
1  France    FR 2000      1000  1000
2  France    FR 2001      1200  1400
3  France    FR 2003      1400  1800
4  France    FR 2004      1600  2200
5      UK    UK 2000      1000  1000
6      UK    UK 2001      1000  1000
7      UK    UK 2003      1000  1000
8      UK    UK 2004      1000  1000

如果组中的所有元素都是NA那么:

df%>% 
  group_by(Country)%>% 
  mutate_if(is.numeric,~if(all(is.na(.x))) NA else na.fill(.x,"extend"))