Check if two values within consecutive dates are identical

Suppose I have a tibble like this:

df <- tribble(
  ~date,       ~place, ~wthr,
  #------------/-----/--------
  "2017-05-06","NY","sun",
  "2017-05-06","CA","cloud",
  "2017-05-07","NY","sun",
  "2017-05-07","CA","rain",
  "2017-05-08","NY","cloud",
  "2017-05-08","CA","rain",
  "2017-05-09","NY","cloud",
  "2017-05-09","CA",NA,
  "2017-05-10","NY","cloud",
  "2017-05-10","CA","rain"
)

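I want to check whether the weather in a given place on a given date is the same as it was there the day before, and append the result to df as a boolean column, so that I get: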

tribble(
  ~date,       ~place, ~wthr, ~same,
  #------------/-----/------/------
  "2017-05-06","NY","sun",    NA,
  "2017-05-06","CA","cloud",  NA, 
  "2017-05-07","NY","sun",    TRUE,
  "2017-05-07","CA","rain",   FALSE,
  "2017-05-08","NY","cloud",  FALSE,
  "2017-05-08","CA","rain",   TRUE,
  "2017-05-09","NY","cloud",  TRUE,
  "2017-05-09","CA", NA,      NA,
  "2017-05-10","NY","cloud",  TRUE,
  "2017-05-10","CA","rain",   NA
)

Is there a good way to do this?

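To get the logical column, group by place and then use lag to check whether each wthr value equals the value in the previous row. I added arrange on date to make sure the rows are in chronological order.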

library(dplyr)

df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = wthr == lag(wthr, default = NA))
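
To see why the first row of each group and any row next to a missing value come out as NA, here is a minimal sketch of the same comparison on a plain vector. The vector `x` is made up for illustration and stands in for the wthr values of a single place, already sorted by date.

```r
library(dplyr)

# Hypothetical weather values for one place, in date order
x <- c("sun", "sun", "cloud", NA, "cloud")

# lag() shifts the vector by one position and pads the front with NA
prev <- lag(x)

# == propagates NA, so a missing current or previous value gives NA
same <- x == prev

print(data.frame(x, prev, same))
```

Inside mutate after group_by(place), this shift is applied separately within each place, so a CA row is never compared with an NY row.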

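Edit: If you want to make sure the dates are consecutive (1 day apart), you can include an ifelse that checks the difference between date and lag(date). If the two rows are not 1 day apart, the result can be coded as NA.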

Note: Also, make sure your date column is of class Date:

df$date <- as.Date(df$date)

df %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = ifelse(
    date - lag(date) == 1, 
    wthr == lag(wthr, default = NA),
    NA))
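
The sample data has no missing days, so the ifelse branch never fires there. As a rough check of the gap handling, the sketch below uses a small made-up tibble (`gaps`, not part of the question) in which NY has no row for 2017-05-08.

```r
library(dplyr)
library(tibble)

# Hypothetical data: NY has no observation on 2017-05-08
gaps <- tibble(
  date  = as.Date(c("2017-05-06", "2017-05-07", "2017-05-09")),
  place = "NY",
  wthr  = c("sun", "sun", "sun")
)

res <- gaps %>%
  arrange(date) %>%
  group_by(place) %>%
  mutate(same = ifelse(
    date - lag(date) == 1,   # TRUE only when the previous row is exactly 1 day earlier
    wthr == lag(wthr),
    NA)) %>%
  ungroup()

print(res)
```

Here same is NA for 2017-05-06 (no previous row), TRUE for 2017-05-07, and NA for 2017-05-09, because the previous row is two days earlier even though the weather matches.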

Output

   date       place wthr  same 
   <chr>      <chr> <chr> <lgl>
 1 2017-05-06 NY    sun   NA   
 2 2017-05-06 CA    cloud NA   
 3 2017-05-07 NY    sun   TRUE 
 4 2017-05-07 CA    rain  FALSE
 5 2017-05-08 NY    cloud FALSE
 6 2017-05-08 CA    rain  TRUE 
 7 2017-05-09 NY    cloud TRUE 
 8 2017-05-09 CA    NA    NA   
 9 2017-05-10 NY    cloud TRUE 
10 2017-05-10 CA    rain  NA