Removing extra elements in a list based on another list

Question

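I have a dataset that I'm trying to split into two lists. Each list contains one element (e.g., [[1]], [[2]], [[3]] in the list object) per ID for each 10-day interval (e.g., days 1-10 in [[1]], days 11-20 in [[2]], days 21-31 in [[3]]).

In the example code below, the list for jan has three intervals for each ID (e.g., A has three elements for its three intervals, and so do B and C). The list for july has only two intervals per ID (e.g., for a given ID it contains only [[1]] and [[2]] rather than three elements), which is a problem for me.

I'm trying to figure out how to remove the extra intervals in jan that have no counterpart in july. For example, for ID A, I'd like a function that compares the two lists and removes the third interval in jan (the one missing from july). How would I go about doing this?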

library(lubridate)
library(tidyverse)
date <- rep_len(seq(dmy("01-01-2010"), dmy("20-07-2010"), by = "days"), 600)
ID <- rep(c("A","B","C"), 200)

df <- data.frame(date = date,
                 x = runif(length(date), min = 60000, max = 80000),
                 y = runif(length(date), min = 800000, max = 900000),
                 ID)

df$month <- month(df$date)

jan <- df %>%
  mutate(new = floor_date(date, "10 days")) %>%
  group_by(ID) %>% 
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>% 
  group_by(new, .add = TRUE) %>%
  filter(month == "1") %>% 
  group_split()

july <- df %>%
  mutate(new = floor_date(date, "10 days")) %>%
  group_by(ID) %>% 
  mutate(new = if_else(day(new) == 31, new - days(10), new)) %>% 
  group_by(new, .add = TRUE) %>%
  filter(month == "7") %>% 
  group_split()
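
To make the mismatch concrete, here is a minimal sketch that counts how many list elements each ID contributes. The stand-in lists (jan_demo, july_demo) and the helpers make_demo() and count_per_id() are assumptions for illustration only; with the real data, count_per_id() could be applied to jan and july directly.

```r
# Hypothetical stand-ins for jan and july: one data frame per ID and interval
make_demo <- function(ids, n_intervals) {
  keys <- expand.grid(interval = seq_len(n_intervals), ID = ids,
                      stringsAsFactors = FALSE)
  lapply(seq_len(nrow(keys)), function(i) keys[i, , drop = FALSE])
}
jan_demo  <- make_demo(c("A", "B", "C"), 3)
july_demo <- make_demo(c("A", "B", "C"), 2)

# count_per_id() is a hypothetical helper: number of list elements per ID
count_per_id <- function(lst) {
  table(vapply(lst, function(d) as.character(d$ID[1]), character(1)))
}

count_per_id(jan_demo)   # A, B and C each have 3 elements
count_per_id(july_demo)  # A, B and C each have 2 elements
```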

Answer 1

I'm still not sure exactly what you're after. In any case, this code should do what you asked.

df2 <- bind_rows(jan, july) %>%
  # adding a helper variable to distinguish if a day from the date component is
  # 10 or lower, 20 or lower or the rest 
  mutate(helper = ceiling(day(date)/10) %>% pmin(3)) %>% 
  group_by(ID, helper) %>%
  # adding another helper counting how many distinct months there are in the subgroup
  mutate(helper2 = n_distinct(month)) %>% ungroup() %>%
  filter(helper2 == 2) %>%
  # getting rid of the helpers
  select(-helper, -helper2) %>%
  group_by(ID, new)
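
The first helper assigns each day of the month to a 10-day block, and pmin(3) folds day 31 into the third block. A small base R illustration on a few hand-picked days (the vector names here are just for the example):

```r
# Days chosen to sit on the block boundaries
days  <- c(1, 10, 11, 20, 21, 30, 31)

# Same expression as the helper: block index 1, 2 or 3
block <- pmin(ceiling(days / 10), 3)

block  # 1 1 2 2 3 3 3
```

A block that occurs in both January and July for an ID has two distinct months, which is what the helper2 == 2 filter keeps.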

jan2 <- df2 %>%
  filter(month == "1") %>% 
  group_split()
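
If you would rather filter the list itself than rebuild it from a combined data frame, one possible alternative is to label each list element with an ID-and-block key and keep only the January elements whose key also occurs in July. This is a sketch on toy data, not the code above: the toy tibble, the block column and the helper interval_key() are all assumptions introduced for illustration.

```r
library(dplyr)
library(lubridate)

# Toy data (hypothetical): two IDs, January covers all three 10-day blocks,
# July covers only the first two
toy <- tibble(
  ID   = rep(c("A", "B"), each = 5),
  date = rep(ymd(c("2010-01-05", "2010-01-15", "2010-01-25",
                   "2010-07-05", "2010-07-15")), times = 2)
) %>%
  mutate(month = month(date),
         # block 1 = days 1-10, block 2 = days 11-20, block 3 = days 21-31
         block = pmin(ceiling(day(date) / 10), 3))

jan_toy  <- toy %>% filter(month == 1) %>% group_by(ID, block) %>% group_split()
july_toy <- toy %>% filter(month == 7) %>% group_by(ID, block) %>% group_split()

# interval_key() is a hypothetical helper: one "ID_block" label per list element
interval_key <- function(lst) {
  vapply(lst, function(d) paste(d$ID[1], d$block[1], sep = "_"), character(1))
}

# Keep only the January elements whose key also appears in July
jan_trimmed <- jan_toy[interval_key(jan_toy) %in% interval_key(july_toy)]

length(jan_toy)      # 6 elements: A and B, three blocks each
length(jan_trimmed)  # 4 elements: the third block of each ID is dropped
```

The key uses the day-of-month block rather than the new column, because new holds actual dates and so never matches across months.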

Tags: r, list, lubridate, dplyr