如何在R中的数据框中插入行数较少的列

How to insert a column with less rows in a dataframe in R

例如,我有以下数据框:

Date           Price
2020-01-01      500
2020-01-02      550
2020-01-03      480
2020-01-04      420
2020-01-05      450
2020-01-06      390
2020-01-07      430
2020-01-08      480
2020-01-09      490
2020-01-10      485

并且我想在前一个数据框中插入 "Prince2" 列,以便我重复该行直到下一个类似日期:

Date           Price2
2020-01-01       50
2020-01-04       20
2020-01-06       90
2020-01-07       30
2020-01-10       80

会是这样的:

Date           Price     Price2
2020-01-01      500        50
2020-01-02      550        50
2020-01-03      480        50
2020-01-04      420        20
2020-01-05      450        20
2020-01-06      390        90
2020-01-07      430        30
2020-01-08      480        30
2020-01-09      490        30
2020-01-10      485        80

有人可以帮助我吗?

我们可以使用 mergezoo

中的 na.locf0
library(zoo)
transform(merge(df1, df2, by = 'Date',all.x = TRUE), Price2 = na.locf0(Price2))

data.table

library(data.table)
setDT(df1)[df2, Price2 := Price2, on = .(Date)][, Price2 := na.locf0(Price2)][]

tidyverse

library(dplyr)
library(tidyr)
left_join(df1, df2, by = 'Date') %>%
      fill(Price2)
#         Date Price Price2
#1  2020-01-01   500     50
#2  2020-01-02   550     50
#3  2020-01-03   480     50
#4  2020-01-04   420     20
#5  2020-01-05   450     20
#6  2020-01-06   390     90
#7  2020-01-07   430     30
#8  2020-01-08   480     30
#9  2020-01-09   490     30
#10 2020-01-10   485     80

或转换为 Date class 并执行 complete

df1$Date <- as.Date(df1$Date)
df2$Date <- as.Date(df2$Date)
df2 %>% 
   complete(Date = seq(min(df1$Date), max(df1$Date), by = '1 day')) %>%
   fill(Price2) %>% 
   left_join(df1)

数据

df1 <- structure(list(Date = c("2020-01-01", "2020-01-02", "2020-01-03", 
"2020-01-04", "2020-01-05", "2020-01-06", "2020-01-07", "2020-01-08", 
"2020-01-09", "2020-01-10"), Price = c(500L, 550L, 480L, 420L, 
450L, 390L, 430L, 480L, 490L, 485L)), class = "data.frame", row.names = c(NA, 
-10L))


df2 <- structure(list(Date = c("2020-01-01", "2020-01-04", "2020-01-06", 
"2020-01-07", "2020-01-10"), Price2 = c(50L, 20L, 90L, 30L, 80L
)), class = "data.frame", row.names = c(NA, -5L))