Finding the min or max of POSIXct date with NA values
The data below contain a column of individual IDs (with repeated observations), Date, and Fate.
ID Date Fate
1 BHS_1149 2017-04-11 MIA
2 BHS_1154 <NA> <NA>
3 BHS_1155 <NA> <NA>
4 BHS_1156 <NA> <NA>
5 BHS_1157 <NA> Mort
6 BHS_1159 2017-04-11 Alive
7 BHS_1169 2017-04-11 Alive
8 BHS_1259 <NA> <NA>
9 BHS_1260 <NA> <NA>
10 BHS_1262 2017-04-11 MIA
11 BHS_1262 2017-07-05 Alive
12 BHS_1262 2017-12-06 Alive
13 BHS_1262 2017-12-06 MIA
14 BHS_1262 2018-01-17 Mort
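For each ID, I want to create a new column that holds the minimum Date, or the maximum Date when Fate is Alive. I have tried different combinations of including and excluding the na.rm = T argument in the code below, but I still get the warnings shown after the output.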
library(tidyverse)
library(lubridate)
dat %>%
  group_by(ID) %>%
  mutate(
    # the first or min of Date
    FstSurvey = min(Date),
    LstAlive = max(Date[Fate == "Alive"])) %>%
  as.data.frame()
ID Date Fate FstSurvey LstAlive
1 BHS_1149 2017-04-11 MIA 2017-04-11 <NA>
2 BHS_1154 <NA> <NA> <NA> <NA>
3 BHS_1155 <NA> <NA> <NA> <NA>
4 BHS_1156 <NA> <NA> <NA> <NA>
5 BHS_1157 <NA> Mort <NA> <NA>
6 BHS_1159 2017-04-11 Alive 2017-04-11 2017-04-11
7 BHS_1169 2017-04-11 Alive 2017-04-11 2017-04-11
8 BHS_1259 <NA> <NA> <NA> <NA>
9 BHS_1260 <NA> <NA> <NA> <NA>
10 BHS_1262 2017-04-11 MIA 2017-04-11 2017-12-06
11 BHS_1262 2017-07-05 Alive 2017-04-11 2017-12-06
12 BHS_1262 2017-12-06 Alive 2017-04-11 2017-12-06
13 BHS_1262 2017-12-06 MIA 2017-04-11 2017-12-06
14 BHS_1262 2018-01-17 Mort 2017-04-11 2017-12-06
Warning messages:
1: In max.default(numeric(0), na.rm = FALSE) :
no non-missing arguments to max; returning -Inf
2: In max.default(numeric(0), na.rm = FALSE) :
no non-missing arguments to max; returning -Inf
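The code seems to work as expected, but I cannot explain or avoid the warnings, and I could not find a solution in the max or min help pages. Reproducible data are included below.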
dat <- structure(list(ID = c("BHS_1149", "BHS_1154", "BHS_1155", "BHS_1156",
"BHS_1157", "BHS_1159", "BHS_1169", "BHS_1259", "BHS_1260", "BHS_1262",
"BHS_1262", "BHS_1262", "BHS_1262", "BHS_1262"), Date = structure(c(1491890400,
NA, NA, NA, NA, 1491890400, 1491890400, NA, NA, 1491890400, 1499234400,
1512543600, 1512543600, 1516172400), class = c("POSIXct", "POSIXt"
), tzone = ""), Fate = c("MIA", NA, NA, NA, "Mort", "Alive",
"Alive", NA, NA, "MIA", "Alive", "Alive", "MIA", "Mort")), row.names = c(NA,
-14L), .Names = c("ID", "Date", "Fate"), class = "data.frame")
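I also like to write code that does not give me warnings. The warnings come from IDs that have no Alive rows (here BHS_1149 and BHS_1157): Date[Fate == "Alive"] is then an empty vector, and max() of an empty vector warns and returns -Inf. Here is a suggestion for doing the same calculation without the warnings. By using ordered first and last instead of min and max, you avoid the odd situation where R evaluates max() of an empty vector to -Inf.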
dat %>%
  group_by(ID) %>%
  mutate(FstSurvey = first(Date,
                           order_by = Date),
         LstAlive = last(Date[Fate == "Alive"],
                         order_by = Date[Fate == "Alive"]))
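
If you would rather keep max(), another option is to guard against the empty case yourself. The block below is only a sketch: safe_max is a hypothetical helper (it is not part of base R or dplyr), the small data frame is a subset of the rows in the question, and tz = "UTC" is an assumption made so the example is reproducible.

```r
library(dplyr)

# Hypothetical helper, not from any package: drop missing values, and if
# nothing is left return a single NA of the same class instead of letting
# max() warn and return -Inf.
safe_max <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(x[NA_integer_])  # length-1 NA that keeps the class of x
  }
  max(x)
}

# Subset of the question's rows; tz = "UTC" is an assumption.
dat_small <- data.frame(
  ID = c("BHS_1149", "BHS_1154", "BHS_1262", "BHS_1262", "BHS_1262"),
  Date = as.POSIXct(
    c("2017-04-11", NA, "2017-04-11", "2017-07-05", "2017-12-06"),
    tz = "UTC"
  ),
  Fate = c("MIA", NA, "MIA", "Alive", "Alive"),
  stringsAsFactors = FALSE
)

res <- dat_small %>%
  group_by(ID) %>%
  mutate(
    FstSurvey = safe_max(-as.numeric(Date)) * -1,  # min via max, as seconds
    LstAlive  = safe_max(Date[Fate == "Alive"])
  ) %>%
  ungroup()
```

Here LstAlive stays a POSIXct column, with NA for IDs that have no Alive rows. FstSurvey is computed as seconds since the epoch only to show that the same guard works for numeric input; in practice you would probably write a matching safe_min helper and keep the POSIXct class.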