Creating list with the same number of values

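I have a dataset containing dates, IDs, and coordinates that I want to split into seasonal months. For example, for winter I label January as winter1, February as winter2, and March as winter3. I did the same for the summer months.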

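I would like to keep only the IDs that have all of these months, so that when I split the data by ID and year, the resulting lists have the same length.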

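I'm not sure how to simulate the uneven values for each ID in the sample code below, but in my actual data some IDs have only summer1 and no winter1, while the situation may be reversed for summer2 and winter2.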

library(lubridate)
library(tidyverse)

date <- rep_len(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"),1000)
ID <- rep(seq(1, 5), 100)

df <- data.frame(date = date,
                 x = runif(length(date), min = 60000, max = 80000),
                 y = runif(length(date), min = 800000, max = 900000),
                 ID)

df$month <- month(df$date)
df$year <- year(df$date)

df1 <- df %>%
  mutate(season_categ = case_when(month %in% 6 ~ 'summer1',
                                  month %in% 7 ~ 'summer2',
                                  month %in% 8 ~ 'summer3',
                                  month %in% 1 ~ 'winter1',
                                  month %in% 2 ~ 'winter2',
                                  month %in% 3 ~ 'winter3')) %>%
  group_by(year, ID )%>% 
  filter(any(month %in% 6:8) &
           any(month %in% 1:3))

summer_list <- df1 %>% 
  filter(season_categ == "summer1") %>% 
  group_split(year, ID)

# Renames the names in the list to AnimalID and year
names(summer_list) <- sapply(summer_list, 
                             function(x) paste(x$ID[1], 
                                               x$year[1], sep = '_'))

# Creates a list for each year and by ID
winter_list <- df1 %>% 
  filter(season_categ == "winter1") %>% 
  group_split(year, ID)

names(winter_list) <- sapply(winter_list, 
                             function(x) paste(x$ID[1], 
                                               x$year[1], sep = '_'))


It's not very clear to me what you are looking for. Before splitting the data into lists, sort the rows by the relevant columns:

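# Refer to the columns through df1; bare names would be looked up in the global environment
df1 <- df1[order(df1$ID, df1$season_categ), ]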

### Determine which IDs have uneven numbers ###
df1 %>%
  group_by(ID) %>%
  summarize(month_seq = paste(season_categ, collapse = "_"),
            # n() takes no arguments; count the distinct season labels per ID instead
            number_of_months = n_distinct(season_categ, na.rm = TRUE))

#### Remove odd numbers###
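The removal step is not shown above. A minimal sketch of one way to do it, assuming "uneven" means that an ID is missing at least one of the six season labels; the `toy` data, `wanted`, and `even_ids` are hypothetical names used only for illustration:

```r
library(dplyr)

# Hypothetical toy data: ID 1 has all six season labels, ID 2 lacks winter1
toy <- tibble(
  ID = c(rep(1, 6), rep(2, 5)),
  season_categ = c("summer1", "summer2", "summer3",
                   "winter1", "winter2", "winter3",
                   "summer1", "summer2", "summer3",
                   "winter2", "winter3")
)

# The six labels every ID is expected to have
wanted <- c(paste0("summer", 1:3), paste0("winter", 1:3))

# Keep only the IDs whose rows cover every wanted label
even_ids <- toy %>%
  group_by(ID) %>%
  filter(all(wanted %in% season_categ)) %>%
  ungroup()
```

If the labels must be complete within each year rather than across all years, grouping by `ID` and `year` instead of `ID` alone would be the natural change.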

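Not sure if this is what you want, but my understanding is that you want to remove any ID that, in any year, has fewer than all 6 of the winter and summer months (months 1 to 3 and 6 to 8). If that assumption is wrong, you can modify the filter or the grouping.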

Here is one approach:

library(lubridate)
library(dplyr)
set.seed(12345)

# random sampling of dates with this seed gives no July date for ID 2 in 2010
df <- tibble(
  date = sample(seq(dmy("01-01-2010"), dmy("31-12-2013"), by = "days"),
                1000, replace = TRUE),
  x = runif(length(date), min = 60000, max = 80000),
  y = runif(length(date), min = 800000, max = 900000),
  ID = rep(1:5, 200),
  month = month(date),
  year = year(date)) %>%
  arrange(ID, date)

df %>%
  filter(month %in% c(1:3, 6:8)) %>% 
  group_by(ID, year) %>% 
  mutate(complete = length(unique(month)) == 6) %>%
  group_by(ID) %>% 
  filter(all(complete)) %>%
  group_by(ID, year) %>% 
  group_split()
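To see the filter at work on data small enough to check by eye, and to name the list elements by ID and year as in the question, here is a minimal sketch. The `toy` data and the names `kept`, `keys`, and `pieces` are hypothetical; it relies on `group_keys()` returning one row per group in the same order as `group_split()`:

```r
library(dplyr)

# Hypothetical toy data: ID 1 has all six months in 2010, ID 2 is missing July
toy <- tibble(
  ID    = c(rep(1, 6), rep(2, 5)),
  year  = 2010,
  month = c(1:3, 6:8, 1:3, 6, 8)
)

kept <- toy %>%
  filter(month %in% c(1:3, 6:8)) %>%
  group_by(ID, year) %>%
  mutate(complete = n_distinct(month) == 6) %>%  # all six months present this year?
  group_by(ID) %>%
  filter(all(complete)) %>%                      # drop IDs with any incomplete year
  group_by(ID, year)

# Build "ID_year" names from the group keys and attach them to the split list
keys <- group_keys(kept)
pieces <- group_split(kept)
names(pieces) <- paste(keys$ID, keys$year, sep = "_")
```

With this toy input only ID 1 survives, so `pieces` holds a single six-row tibble named `1_2010`. Because the names come from the group keys, a summer list and a winter list built from the same filtered data can be matched by name.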