Pivot wider with tidyverse
I have a long data set like this sample data:
df <- tibble::tribble(
~V1, ~V2, ~V3,
1L, "Hig", 3131000L,
2L, "Hig", 279000L,
3L, "Hig", 1316000L,
1L, "val", 1882000L,
2L, "val", 1433000L,
3L, "val", 555000L,
4L, "val", 856000L,
1L, "nt", 4493000L,
2L, "nt", 233000L,
3L, "nt", 693000L
)
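I want to reshape it to wide format, ignoring the V1 variable and the column names. The desired output (exported as a txt file) should look like this: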
High,3131000,279000,1316000,
val,1882000,1433000,555000,856000,
nt,4493000,233000,693000,
This is what I tried:
df %>%
select(V2, V3) %>%
pivot_wider("V2", values_from="V3" )
Error: Column 1 must be named.
Use .name_repair to specify repair.
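The problem is that pivot_wider() needs to get the new column names from somewhere, and your call does not supply a names_from column.
You can create one from the row position within each V2 group, like this: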
library(tidyverse)
df <- tibble::tribble(
~V1, ~V2, ~V3,
1L, "Hig", 3131000L,
2L, "Hig", 279000L,
3L, "Hig", 1316000L,
1L, "val", 1882000L,
2L, "val", 1433000L,
3L, "val", 555000L,
4L, "val", 856000L,
1L, "nt", 4493000L,
2L, "nt", 233000L,
3L, "nt", 693000L
)
df %>%
select(V2, V3) %>%
group_by(V2) %>%
mutate(temp = row_number()) %>%
pivot_wider(id_cols = V2, names_from=temp, values_from = V3)
which gives you
# A tibble: 3 x 5
# Groups: V2 [3]
V2 `1` `2` `3` `4`
<chr> <int> <int> <int> <int>
1 Hig 3131000 279000 1316000 NA
2 val 1882000 1433000 555000 856000
3 nt 4493000 233000 693000 NA
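If you do go the pivot_wider() route, the wide table still has to be written without a header and without the NA padding. Here is a minimal sketch, assuming base R's write.table() and a hypothetical file name wide_output.txt in the temporary directory (neither is part of the answer above). Note that rows shorter than the longest one end in an empty field, while the longest row gets no trailing comma, so this only approximates the requested layout.

```r
library(dplyr)
library(tidyr)

df <- tibble::tribble(
  ~V1, ~V2,   ~V3,
  1L,  "Hig", 3131000L,
  2L,  "Hig", 279000L,
  3L,  "Hig", 1316000L,
  1L,  "val", 1882000L,
  2L,  "val", 1433000L,
  3L,  "val", 555000L,
  4L,  "val", 856000L,
  1L,  "nt",  4493000L,
  2L,  "nt",  233000L,
  3L,  "nt",  693000L
)

wide <- df %>%
  select(V2, V3) %>%
  group_by(V2) %>%
  mutate(temp = row_number()) %>%  # position within each V2 group becomes the column name
  ungroup() %>%
  pivot_wider(id_cols = V2, names_from = temp, values_from = V3)

# Hypothetical output location, chosen only for this illustration
out_file <- file.path(tempdir(), "wide_output.txt")

# na = "" writes missing cells as empty fields; header and row names are dropped
write.table(wide, file = out_file, sep = ",", na = "",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

wide_lines <- readLines(out_file)
print(wide_lines)
```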
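Since the rows in your desired output have different lengths, I wouldn't use tidyr's pivot_ functions. They work best when the result is a rectangular table.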
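Instead, you can split your data by V2 and paste each group name together with its values from V3. Finally, write the result to the file output.txt, as shown below (the formula form of split() requires R 4.1 or later):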
library(purrr)
split(df, ~V2) %>%
imap_chr(~ paste(c(.y, .x$V3, ""), collapse = ",")) %>%
cat(file = "output.txt", sep = "\n")
The resulting output.txt (split() returns the groups in alphabetical order):
file.show("output.txt")
Hig,3131000,279000,1316000,
nt,4493000,233000,693000,
val,1882000,1433000,555000,856000,
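The lines above come out alphabetically (Hig, nt, val), whereas the question lists them as Hig, val, nt. If the original order matters, one option is to split on a factor whose levels follow first appearance. This is a base R sketch of that idea, not the answer's own code; the file name output_ordered.txt and the use of format() to keep the numbers out of scientific notation are assumptions added for illustration.

```r
# Same V2/V3 values as the sample data, rebuilt as a plain data frame
df <- data.frame(
  V2 = c("Hig", "Hig", "Hig", "val", "val", "val", "val", "nt", "nt", "nt"),
  V3 = c(3131000L, 279000L, 1316000L, 1882000L, 1433000L,
         555000L, 856000L, 4493000L, 233000L, 693000L)
)

# Factor levels in order of first appearance, so split() keeps Hig, val, nt
groups <- split(df$V3, factor(df$V2, levels = unique(df$V2)))

# One line per group: name, values, and an empty element for the trailing comma.
# format() writes the digits out in full instead of relying on implicit coercion.
ordered_lines <- mapply(
  function(name, values) {
    paste(c(name, format(values, scientific = FALSE, trim = TRUE), ""),
          collapse = ",")
  },
  names(groups), groups
)

# Hypothetical output location, chosen only for this illustration
out_file <- file.path(tempdir(), "output_ordered.txt")
writeLines(ordered_lines, out_file)

print(readLines(out_file))
```

This keeps the trailing comma on every line, including the longest one, because the empty string is appended to each group before pasting.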