Special join of four columns into two new ones in R

I work in R and have run into an interesting problem. I want to transform the following data frame:

DF = data.frame(ID = c(1, 2, 3),
              Person1 = c("Devin Davey", "Rui Butt", "Keon Dotson"),
              Sign = "artist",
              Person2 = c("Eli Greer", "Alvin Simons", "Leona Ford"),
              Sex = c("female", "male", "female"),
              Score = c(10, 20, 30)) 



  ID     Person1   Sign      Person2    Sex Score
1  1 Devin Davey artist    Eli Greer female    10
2  2    Rui Butt artist Alvin Simons   male    20
3  3 Keon Dotson artist   Leona Ford female    30

into the following format:

  ID         Name   Sign Score
1  1  Devin Davey artist    10
2  1    Eli Greer female    10
3  2     Rui Butt artist    20
4  2 Alvin Simons   male    20
5  3  Keon Dotson artist    30
6  3   Leona Ford female    30

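In other words, four columns are joined in a special way into two new columns: each Person column is paired with its own attribute column (Person1 with Sign, Person2 with Sex).

Here is my current idea:

library(dplyr)
library(tidyr)

PART1 <- DF %>% 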
            select(ID, Person1, Person2, Score) %>%
            gather(key, Name, -c(ID, Score), na.rm = TRUE) %>%
            select(-key) %>%
            arrange(ID) %>%
            mutate(temp_id = 1:n())

PART2 <- DF %>% 
            select(ID, Sign, Sex) %>%
            gather(key, Sign, -ID, na.rm = TRUE) %>%
            select(-key) %>%
            arrange(ID) %>%
            mutate(temp_id = 1:n())

PART1 %>%
        left_join(PART2, by = c("ID" = "ID", "temp_id" = "temp_id")) %>%
        select(-temp_id) %>%
        relocate(Score, .after = Sign)

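However, I don't think this solution is very elegant, and I suspect the problem can be solved in a better way.

So I would be grateful for any ideas on solving this with the tidyverse.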

We can rename 'Sign' and 'Sex' to a common name 'Sign' with a sequence number appended as a suffix, so that they match the Person columns, and then use pivot_longer

library(dplyr)
library(tidyr)
DF %>% 
   rename_at(vars(c('Sign', 'Sex')), ~ c('Sign1', 'Sign2')) %>% 
   pivot_longer(cols = -c(ID, Score), names_to = c(".value", "grp"), 
        names_sep = "(?<=[a-z])(?=\\d)") %>%
   select(ID, Name = Person, Sign, Score)

-output

# A tibble: 6 x 4
#     ID Name         Sign   Score
#  <dbl> <chr>        <chr>  <dbl>
#1     1 Devin Davey  artist    10
#2     1 Eli Greer    female    10
#3     2 Rui Butt     artist    20
#4     2 Alvin Simons male      20
#5     3 Keon Dotson  artist    30
#6     3 Leona Ford   female    30
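
If the lookaround regex in names_sep is hard to read, the same idea can probably be expressed with names_pattern and two capture groups. This is only a minimal sketch, assuming dplyr and tidyr (1.0.0 or later) are installed; Sign1/Sign2 and the helper column grp are illustrative names, as in the code above, and res is just a name chosen here.

```r
library(dplyr)
library(tidyr)

DF <- data.frame(ID = c(1, 2, 3),
                 Person1 = c("Devin Davey", "Rui Butt", "Keon Dotson"),
                 Sign = "artist",
                 Person2 = c("Eli Greer", "Alvin Simons", "Leona Ford"),
                 Sex = c("female", "male", "female"),
                 Score = c(10, 20, 30),
                 stringsAsFactors = FALSE)

res <- DF %>%
  # give both attribute columns a common stem plus a numeric suffix
  rename(Sign1 = Sign, Sign2 = Sex) %>%
  # first capture group becomes the output column name (.value),
  # second capture group becomes the pair index (grp)
  pivot_longer(cols = -c(ID, Score),
               names_to = c(".value", "grp"),
               names_pattern = "([A-Za-z]+)(\\d+)") %>%
  select(ID, Name = Person, Sign, Score)

res
```

The explicit capture groups make it visible which part of each column name ends up as the new column and which part only identifies the pair.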

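In base R, you can use the function reshape. Since it returns the rows in a different order, we reorder them to get exactly the data shown above, although this is not strictly necessary

DF1 <- reshape(DF, varying = matrix(2:5, 2), direction = "long")
DF1[order(DF1$ID), c("ID", "Person1", "Sign", "Score")]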

    ID      Person1   Sign Score
1.1  1  Devin Davey artist    10
1.2  1    Eli Greer female    10
2.1  2     Rui Butt artist    20
2.2  2 Alvin Simons   male    20
3.1  3  Keon Dotson artist    30
3.2  3   Leona Ford female    30
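
Note that the name column in this result is still called Person1. A slightly more explicit sketch, using only base R, spells out the two sets of varying columns and names the output columns directly. The names Name, grp, long and res are chosen here for illustration and are not part of the answer above.

```r
DF <- data.frame(ID = c(1, 2, 3),
                 Person1 = c("Devin Davey", "Rui Butt", "Keon Dotson"),
                 Sign = "artist",
                 Person2 = c("Eli Greer", "Alvin Simons", "Leona Ford"),
                 Sex = c("female", "male", "female"),
                 Score = c(10, 20, 30),
                 stringsAsFactors = FALSE)

long <- reshape(DF,
                direction = "long",
                # each vector is one set of wide columns that collapses
                # into a single long column
                varying = list(c("Person1", "Person2"), c("Sign", "Sex")),
                v.names = c("Name", "Sign"),
                timevar = "grp",  # 1 = first person of the pair, 2 = second
                idvar = "ID")

# reshape() stacks all grp == 1 rows before the grp == 2 rows,
# so sort by ID and grp to interleave the pairs
res <- long[order(long$ID, long$grp), c("ID", "Name", "Sign", "Score")]
res
```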

You can explicitly select the columns by name and use bind_rows

library(tidyverse)
# Person1 is paired with Sign, Person2 with Sex (as in the requested output)
bind_rows(DF %>% select(ID, Name = Person1, Sign, Score),
          DF %>% select(ID, Name = Person2, Sign = Sex, Score)) %>% 
  arrange(ID)
#>   ID         Name   Sign Score
#> 1  1  Devin Davey artist    10
#> 2  1    Eli Greer female    10
#> 3  2     Rui Butt artist    20
#> 4  2 Alvin Simons   male    20
#> 5  3  Keon Dotson artist    30
#> 6  3   Leona Ford female    30

Or with full_join

library(tidyverse)
# Person1 is paired with Sign, Person2 with Sex (as in the requested output)
DF %>% select(ID, Name = Person1, Sign, Score) %>% 
  full_join(DF %>% select(ID, Name = Person2, Sign = Sex, Score)) %>% 
  arrange(ID)
#> Joining, by = c("ID", "Name", "Sign", "Score")
#>   ID         Name   Sign Score
#> 1  1  Devin Davey artist    10
#> 2  1    Eli Greer female    10
#> 3  2     Rui Butt artist    20
#> 4  2 Alvin Simons   male    20
#> 5  3  Keon Dotson artist    30
#> 6  3   Leona Ford female    30