Invert columns/Change unit of observation
I am working with data that looks like this:
CATEGORY PARENT PROPERTY `2018 DOLS` `2018 UNITS` `2017 DOLS` `2017 UNITS`
xxx A P1 100 1 200 2
xxx A P2 NA NA 200 1
xxx B P3 300 1 NA NA
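As you can see, values in the PARENT column can be repeated multiple times. I would like to convert this data frame into a panel at the PARENT-YEAR level, like this: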
CATEGORY PARENT YEAR `P1 DOLS` `P1 UNITS` `P2 DOLS` `P2 UNITS` `P3 DOLS` `P3 UNITS`
xxx A 2017 200 2 200 1 0 0
xxx A 2018 100 1 0 0 0 0
xxx B 2018 0 0 0 0 300 1
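Note that this effectively amounts to swapping the year and PROPERTY entries between columns and rows (while ignoring NAs). What is the most efficient way to perform this task? Thanks!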
Use pivot_longer and pivot_wider from tidyr.
library(tidyr)

df %>%
  # Split each "<year> <measure>" column name into Year and name, dropping NA cells
  pivot_longer(cols = matches('^\\d+'),
               names_to = c('Year', 'name'),
               names_sep = '\\s+',
               values_drop_na = TRUE) %>%
  # Spread back out with one column per PROPERTY/measure pair, filling gaps with 0
  pivot_wider(names_from = c(PROPERTY, name), values_from = value,
              values_fill = 0, names_sep = ' ')
#  CATEGORY PARENT Year  `P1 DOLS` `P1 UNITS` `P2 DOLS` `P2 UNITS` `P3 DOLS` `P3 UNITS`
#  <chr>    <chr>  <chr>     <int>      <int>     <int>      <int>     <int>      <int>
#1 xxx      A      2018        100          1         0          0         0          0
#2 xxx      A      2017        200          2       200          1         0          0
#3 xxx      B      2018          0          0         0          0       300          1
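To see why the two pivots give the requested layout, it can help to look at the intermediate long table on its own. The sketch below rebuilds the sample data with data.frame() (an assumption made only to keep the block self-contained) and runs just the first step; the variable name `long` is illustrative and not part of the original answer.

```r
library(tidyr)

# Sample data from the question, rebuilt so this block runs on its own
df <- data.frame(
  CATEGORY = c("xxx", "xxx", "xxx"),
  PARENT   = c("A", "A", "B"),
  PROPERTY = c("P1", "P2", "P3"),
  `2018 DOLS`  = c(100L, NA, 300L),
  `2018 UNITS` = c(1L, NA, 1L),
  `2017 DOLS`  = c(200L, 200L, NA),
  `2017 UNITS` = c(2L, 1L, NA),
  check.names = FALSE  # keep the "2018 DOLS" style names unchanged
)

# Step 1 only: one row per PARENT/PROPERTY/Year/measure, NA cells dropped
long <- pivot_longer(
  df,
  cols = matches("^\\d+"),       # every column whose name starts with a digit
  names_to = c("Year", "name"),  # "2018 DOLS" -> Year = "2018", name = "DOLS"
  names_sep = "\\s+",
  values_drop_na = TRUE
)

print(long)
```

Each non-missing cell of the original table becomes one row here, so PROPERTY and name are ordinary columns that pivot_wider can combine into the new column names.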
Data
df <- structure(list(CATEGORY = c("xxx", "xxx", "xxx"), PARENT = c("A",
"A", "B"), PROPERTY = c("P1", "P2", "P3"), `2018 DOLS` = c(100L,
NA, 300L), `2018 UNITS` = c(1L, NA, 1L), `2017 DOLS` = c(200L,
200L, NA), `2017 UNITS` = c(2L, 1L, NA)),
class = "data.frame", row.names = c(NA, -3L))
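The result above keeps the year as a character column and leaves rows in order of first appearance, whereas the requested panel lists 2017 before 2018 for parent A. A possible variation, sketched below, converts the year while pivoting and then sorts with base R. It assumes tidyr 1.1.0 or later for the names_transform argument; the object name `panel` and the upper-case `YEAR` column name are choices made here to match the requested layout, not part of the original answer.

```r
library(tidyr)

# Same sample data as above
df <- structure(list(CATEGORY = c("xxx", "xxx", "xxx"), PARENT = c("A",
"A", "B"), PROPERTY = c("P1", "P2", "P3"), `2018 DOLS` = c(100L,
NA, 300L), `2018 UNITS` = c(1L, NA, 1L), `2017 DOLS` = c(200L,
200L, NA), `2017 UNITS` = c(2L, 1L, NA)),
class = "data.frame", row.names = c(NA, -3L))

panel <- df %>%
  pivot_longer(cols = matches("^\\d+"),
               names_to = c("YEAR", "name"),
               names_sep = "\\s+",
               # assumption: tidyr >= 1.1.0, which provides names_transform
               names_transform = list(YEAR = as.integer),
               values_drop_na = TRUE) %>%
  pivot_wider(names_from = c(PROPERTY, name), values_from = value,
              values_fill = 0, names_sep = " ")

# Sort to the PARENT-YEAR order shown in the question
panel <- panel[order(panel$PARENT, panel$YEAR), ]

print(panel)
```

Sorting with order() avoids adding dplyr as a dependency; dplyr::arrange(PARENT, YEAR) would be an equivalent choice if dplyr is already loaded.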