Reshape dataset based on multiple columns

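I have reshaped data before, but the cells could always be identified by two variables. That is not possible with my current data. An excerpt of my data is shown below. The full dataset covers more countries and years.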

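Country    Fear of Crime               TOTAL  2007  2009  2010
Argentina  All or almost all the time     37    37    33    27
Argentina  Sometimes                      34    42    35    40
Argentina  Occasionally                   18    14    23    23
Argentina  Never                          11     6     8    10
Argentina  Don't know/No answer            0     1     1     0
Bolivia    All or almost all the time     38    35    36    34
Bolivia    Sometimes                      36    40    41    40
Bolivia    Occasionally                   17    17    18    18
Bolivia    Never                           8     6     4     6
Bolivia    Don't know/No answer            1     1     0     1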

I need the data in this format:

Country  All or almost all the time  Sometimes  Occasionally  Never  Don't know/No answer

Does anyone have a solution? Many thanks!

library(dplyr)
library(tidyr)

dat %>%
    pivot_longer(
        cols = -c(Country, `Fear of Crime`),
        names_to = "Year"
    ) %>%
    pivot_wider(
        id_cols = c(Year, Country),
        names_from = `Fear of Crime`,
        values_from = value
    )

# A tibble: 6 x 7
#  Year  Country     All Sometimes Occasionally Never `Don't know`
#  <chr> <chr>     <dbl>     <dbl>        <dbl> <dbl>        <dbl>
#1 2007  Argentina  52.0      29.7         52.1  34.2         59.9
#2 2009  Argentina  52.8      38.1         42.0  73.5         42.9
#3 2010  Argentina  56.2      64.6         31.0  71.6         32.1
#4 2007  Bolivia    36.8      37.4         31.4  45.0         56.3
#5 2009  Bolivia    53.2      52.8         62.8  56.1         59.9
#6 2010  Bolivia    42.4      45.1         67.4  55.0         58.1

Data (random values, so your numbers will differ from the output above):

dat <- tibble(
    Country = rep(c("Argentina", "Bolivia"), each = 5),
    `Fear of Crime` = rep(c("All", "Sometimes", "Occasionally", "Never", "Don't know"), 2),
    `2007` = rnorm(10, 50, 10),
    `2009` = rnorm(10, 50, 10),
    `2010` = rnorm(10, 50, 10)
)

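You can also use the following solution. I kept the TOTAL values as a level of the Year variable and made a slight change to Fear_of_Crime so that all of its values are in title case: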

library(dplyr)   # needed for mutate()
library(tidyr)
library(stringr)

df %>%
  pivot_longer(TOTAL:X2010, names_to = "Year", names_prefix = "X?") %>%
  mutate(Fear_of_Crime = str_to_title(Fear_of_Crime)) %>%
  pivot_wider(names_from = Fear_of_Crime, values_from = value)

# A tibble: 8 x 7
  Country   Year  All_or_almost_the_time Sometimes Occasionally Never `Don´T_know/No_answer`
  <chr>     <chr>                  <int>     <int>        <int> <int>                  <int>
1 Argentina TOTAL                     37        34           18    11                      0
2 Argentina 2007                      37        42           14     6                      1
3 Argentina 2009                      33        35           23     8                      1
4 Argentina 2010                      27        40           23    10                      0
5 Bolivia   TOTAL                     38        36           17     8                      1
6 Bolivia   2007                      35        40           17     6                      1
7 Bolivia   2009                      36        41           18     4                      0
8 Bolivia   2010                      34        40           18     6                      1
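
To try this pipeline on the question's excerpt, here is a minimal sketch. The object name `df` and the column names `Fear_of_Crime`, `TOTAL`, `X2007`, `X2009` and `X2010` are assumptions inferred from the code above (they are not stated in the question), and the title-casing step is left out to keep the column names simple.

```r
library(dplyr)
library(tidyr)

# Assumed reconstruction of the question's excerpt; the column names are
# guesses based on the answer's code, not taken from the original file.
df <- data.frame(
  Country = rep(c("Argentina", "Bolivia"), each = 5),
  Fear_of_Crime = rep(c("All_or_almost_the_time", "Sometimes", "Occasionally",
                        "Never", "Don't_know/No_answer"), times = 2),
  TOTAL = c(37L, 34L, 18L, 11L, 0L, 38L, 36L, 17L, 8L, 1L),
  X2007 = c(37L, 42L, 14L, 6L, 1L, 35L, 40L, 17L, 6L, 1L),
  X2009 = c(33L, 35L, 23L, 8L, 1L, 36L, 41L, 18L, 4L, 0L),
  X2010 = c(27L, 40L, 23L, 10L, 0L, 34L, 40L, 18L, 6L, 1L),
  stringsAsFactors = FALSE
)

wide <- df %>%
  # Stack TOTAL and the year columns into Year/value pairs,
  # stripping the leading "X" from the year names
  pivot_longer(TOTAL:X2010, names_to = "Year", names_prefix = "X") %>%
  # Spread the answer categories into one column each
  pivot_wider(names_from = Fear_of_Crime, values_from = value)

wide
```

The result should have one row per Country/Year combination (eight rows for this excerpt) and one column per answer category.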

Using data.table:

library(data.table)

dcast(melt(setDT(df), id.vars = c('Country', 'Fear of Crime')), 
      Country + variable ~ `Fear of Crime` , value.var = 'value')

#     Country variable      All Don't know    Never Occasionally Sometimes
#1: Argentina     2007 61.35123   51.64059 52.90937     56.06212  51.27404
#2: Argentina     2009 49.97756   48.41825 67.63133     35.55390  46.89938
#3: Argentina     2010 54.35360   57.17569 43.49386     54.01240  59.79714
#4: Argentina    Total 77.11749   66.08187 57.24466     82.39351  89.21991
#5:   Bolivia     2007 53.28061   49.66029 39.45862     54.87632  36.10037
#6:   Bolivia     2009 32.22393   44.43537 56.89622     58.62973  40.09476
#7:   Bolivia     2010 43.81035   43.07929 39.85770     57.56582  49.35075
#8:   Bolivia    Total 97.55278   57.06825 72.95792     87.52021  59.40992
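
The `variable` column in the output above holds the former column names. As a small sketch, `melt()` takes a `variable.name` argument, so that column can be called `Year` from the start. The object name `toy` is hypothetical, and its values are a small subset of the question's excerpt.

```r
library(data.table)

# Hypothetical toy table with the same layout as the question's excerpt
toy <- data.table(
  Country = rep(c("Argentina", "Bolivia"), each = 2),
  `Fear of Crime` = rep(c("Sometimes", "Never"), times = 2),
  `2007` = c(42L, 6L, 40L, 6L),
  `2009` = c(35L, 8L, 41L, 4L)
)

# Long format: one row per Country / category / Year
long <- melt(toy, id.vars = c("Country", "Fear of Crime"),
             variable.name = "Year", value.name = "value")

# Wide format: one row per Country / Year, one column per category
wide <- dcast(long, Country + Year ~ `Fear of Crime`, value.var = "value")
wide
```

Note that `melt()` returns `Year` as a factor by default; wrap it in `as.character()` if you need plain strings.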