How to transform time codes to turn codes with dplyr when the data frame includes more than one event per code
I would like to transform time codes like these
library(lubridate)
library(tidyverse)
df_time <- tibble(time = c(ymd_hms("2020_01_01 00:00:01"),
                           ymd_hms("2020_01_01 00:00:02"),
                           ymd_hms("2020_01_01 00:00:03"),
                           ymd_hms("2020_01_01 00:00:04"),
                           ymd_hms("2020_01_01 00:00:05"),
                           ymd_hms("2020_01_01 00:00:06"),
                           ymd_hms("2020_01_01 00:00:07"),
                           ymd_hms("2020_01_01 00:00:08"),
                           ymd_hms("2020_01_01 00:00:09"),
                           ymd_hms("2020_01_01 00:00:10")),
                  a = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 0),
                  b = c(0, 0, 1, 1, 0, 1, 1, 1, 0, 0))
which gives
> df_time
# A tibble: 10 x 3
time a b
<dttm> <dbl> <dbl>
1 2020-01-01 00:00:01 0 0
2 2020-01-01 00:00:02 1 0
3 2020-01-01 00:00:03 1 1
4 2020-01-01 00:00:04 1 1
5 2020-01-01 00:00:05 1 0
6 2020-01-01 00:00:06 0 1
7 2020-01-01 00:00:07 0 1
8 2020-01-01 00:00:08 1 1
9 2020-01-01 00:00:09 1 0
10 2020-01-01 00:00:10 0 0
into turn codes (a.k.a. event codes / "start stop data"). The result should look like the following df:
df_turn <- tibble(start = c(ymd_hms("2020_01_01 00:00:02"),
                            ymd_hms("2020_01_01 00:00:03"),
                            ymd_hms("2020_01_01 00:00:06"),
                            ymd_hms("2020_01_01 00:00:08")),
                  end = c(ymd_hms("2020_01_01 00:00:05"),
                          ymd_hms("2020_01_01 00:00:04"),
                          ymd_hms("2020_01_01 00:00:08"),
                          ymd_hms("2020_01_01 00:00:09")),
                  code = c("a", "b", "b", "a"))
> df_turn
# A tibble: 4 x 3
start end code
<dttm> <dttm> <chr>
1 2020-01-01 00:00:02 2020-01-01 00:00:05 a
2 2020-01-01 00:00:03 2020-01-01 00:00:04 b
3 2020-01-01 00:00:06 2020-01-01 00:00:08 b
4 2020-01-01 00:00:08 2020-01-01 00:00:09 a
This great post provides a solution for one event per code, but not for more than one.
Thanks!
I would use this solution, taken from link for a similar task:
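df_time %>%
  pivot_longer(-time) %>%
  group_by(name) %>%
  # +1 marks the start of a run of 1s; default = 0 avoids NA when a series starts with 1
  mutate(tmp = value - lag(value, default = 0)) %>%
  filter(value == 1) %>%
  # the running sum of the start markers numbers the events within each code
  mutate(tmp = cumsum(tmp)) %>%
  group_by(name, tmp) %>%
  summarise(start = min(time),
            end = max(time))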
# A tibble: 4 x 4
# Groups: name [2]
name tmp start end
<chr> <dbl> <dttm> <dttm>
1 a 1 2020-01-01 00:00:02 2020-01-01 00:00:05
2 a 2 2020-01-01 00:00:08 2020-01-01 00:00:09
3 b 1 2020-01-01 00:00:03 2020-01-01 00:00:04
4 b 2 2020-01-01 00:00:06 2020-01-01 00:00:08
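The result above has the right intervals but not quite the layout of df_turn (columns start, end, code, ordered by start). A minimal sketch of one way to get there, assuming dplyr >= 1.0.0 for the `.groups` argument; the helper column `run` is a name made up for this example, and the run id is built by comparing each value with the previous one rather than by differencing:

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Same example data as in the question; adding 0:9 to a POSIXct adds seconds
df_time <- tibble(
  time = ymd_hms("2020-01-01 00:00:01") + 0:9,
  a = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 0),
  b = c(0, 0, 1, 1, 0, 1, 1, 1, 0, 0)
)

df_turn <- df_time %>%
  pivot_longer(-time, names_to = "code") %>%
  arrange(code, time) %>%
  group_by(code) %>%
  # run id: goes up by one every time the value changes within a code
  mutate(run = cumsum(value != lag(value, default = first(value)))) %>%
  # keep only the runs where the code is active
  filter(value == 1) %>%
  group_by(code, run) %>%
  summarise(start = min(time), end = max(time), .groups = "drop") %>%
  arrange(start) %>%
  select(start, end, code)

df_turn
```

Because the run id only depends on changes in value, this should also cope with a series that already starts with 1.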