How do I separate columns depending on time in R?
We are grad students and are struggling a bit with R.
First, we tried to add morning and afternoon to our data. Morning is time <= 57.
We have a dataset: d
  box    year   day  time nVisit visit morning
  <fct> <dbl> <dbl> <dbl>  <dbl> <dbl> <lgl>
1 212    2020   243    75      0     0 FALSE
2 212    2020   243    76      0     0 FALSE
3 212    2020   243    77      0     0 FALSE
4 212    2020   243    78      0     0 FALSE
5 212    2020   243    79      0     0 FALSE
But what we want is the sum of visits in separate columns, grouped by day and box.
So we added this piece:
d3 <- d %>%
  group_by(box, day, morning) %>%
  summarise_at(vars("visit"), sum)
d3$frac <- d3$visit / 57
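But now we don't know how to get separate columns for the morning and afternoon counts.
It would be great if you know how to help!
Thanks in advance,
2 grad students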
Try this:
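library(dplyr)
library(tidyr) # pivot_wider
randdat %>%
  # Label each row by period. The question defines morning as time <= 57;
  # change `<` to `<=` to match it exactly (the output below was produced with `<`).
  mutate(period = if_else(time < 57, "morning", "afternoon")) %>%
  group_by(box, day, period) %>%
  summarize(visit = sum(visit)) %>%
  # One row per box/day, one column per period
  pivot_wider(id_cols = c(box, day), names_from = "period", values_from = "visit") %>%
  ungroup()
# # A tibble: 2 x 4
#     box   day afternoon morning
#   <dbl> <int>     <int>   <int>
# 1   212   243         2       2
# 2   212   244         3       0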
Data:
set.seed(2021)
randdat <- tibble(
  box = 212,
  year = 2020,
  day = sample(243:244, size = 20, replace = TRUE),
  time = sample(round(runif(20, 0, 100), 0), size = 20),
  visit = sample(0:1, size = 20, replace = TRUE)
) %>% arrange(box, year, day, time)
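One thing to watch for with this approach: if a box/day combination has no rows at all for one period, pivot_wider() leaves an NA in that cell rather than a zero. A minimal sketch of handling that with the `values_fill` argument, assuming a made-up data frame `toy` (not the asker's data) in which day 244 has no morning observations:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: day 244 has no rows with time <= 57
toy <- tibble(
  box   = c(212, 212, 212, 212),
  day   = c(243, 243, 244, 244),
  time  = c(10, 80, 70, 90),
  visit = c(1, 0, 1, 1)
)

wide <- toy %>%
  mutate(period = if_else(time <= 57, "morning", "afternoon")) %>%
  group_by(box, day, period) %>%
  summarise(visit = sum(visit), .groups = "drop") %>%
  # values_fill = 0 turns the missing morning cell for day 244 into 0
  pivot_wider(id_cols = c(box, day), names_from = period,
              values_from = visit, values_fill = 0)

wide
```

Whether 0 or NA is the right value for a period with no observations depends on what the data mean, so treat the fill as a choice rather than a default.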
Without a sample of your data, I can suggest the following:
library(dplyr)
library(tidyr)
d <- d %>%
  # Assign each row to a part of the day based on time
  mutate(day_part = case_when(
    time <= 57 ~ "morning",
    time > 57 & time <= 114 ~ "afternoon",
    time > 114 ~ "evening"
  )) %>%
  group_by(box, day, day_part) %>%
  summarise(visits = sum(visit, na.rm = TRUE)) %>%
  ungroup() %>%
  # Spread the parts of the day into separate columns
  pivot_wider(names_from = day_part, values_from = visits)
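As an alternative to reshaping, the two counts can also be computed directly inside summarise() by subsetting `visit` on the time condition. This is a different technique from the pivot_wider() answers above, sketched here on a made-up data frame `toy` (an assumption, not the asker's data), with morning defined as time <= 57 as in the question:

```r
library(dplyr)

# Hypothetical toy data standing in for `d`
toy <- tibble(
  box   = factor(c(212, 212, 212, 212)),
  day   = c(243, 243, 244, 244),
  time  = c(10, 80, 70, 90),
  visit = c(1, 0, 1, 1)
)

counts <- toy %>%
  group_by(box, day) %>%
  summarise(
    # Sum only the visits whose time falls in each period
    morning   = sum(visit[time <= 57], na.rm = TRUE),
    afternoon = sum(visit[time > 57], na.rm = TRUE),
    .groups = "drop"
  )

counts
```

Because sum() of an empty vector is 0, a period with no rows shows up as 0 here without any extra fill step.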