How to select (four) specific rows (multiple times) based on a column value in R?

Question

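I want to keep only the IDs that appear in my data frame in every year from 2013 to 2016 (four times). In that case, only IDs with four rows should remain (this is panel data, with one row per ID per year). I have already made sure that my data frame covers only the years I need (2013, 2014, 2015 and 2016), but I want to exclude the IDs that have fewer than 4 years/rows in my data frame.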

This is the structure of my data frame:

 tibble [909,587 x 26] (S3: tbl_df/tbl/data.frame)
     $ ID                         : num [1:909587] 12 12 12 12 16 16 16 16...
     $ Gender                     : num [1:909587] 2 2 2 2 1 1 1 1 1 1 ...
      ..- attr(*, "format.spss")= chr "F10.0"
     $ Year                       : chr [1:909587] "2016" "2013" "2014" "2015" ...
      ..- attr(*, "format.spss")= chr "F9.3"
     $ Size                       : num [1:909587] 1983 1999 1951 1976 902 ...
     $ Costs                      : num [1:909587] 2957.47 0 0.34 1041.67 0 ...
     $ Urbanisation               : num [1:909587] 2 3 3 2 3 3 2 2 2 3 ...
     $ Age                        : num [1:909587] 92 89 90 91 82 83 22 23 24 65 ...

How can I do this?

Thanks!

Answer 1

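Pivot your df to wide format, so that each year becomes its own column:

df_wide <- df %>% pivot_wider(names_from = Year, values_from = Age)

Filter out the rows that have NA in any of the columns 2013, 2014, 2015, 2016.

Then pivot back to long format. The year columns have non-syntactic names, so they need backticks (a bare 2013:2016 would be read as column positions):

df_wide %>% pivot_longer(c(`2013`, `2014`, `2015`, `2016`), names_to = "Year", values_to = "Age")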
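
The three steps above can be sketched end to end as follows. This is only a minimal illustration on a made-up toy panel (the `toy` data and the names `years` and `complete_long` are assumptions, not from the question). It also passes `id_cols = ID` to `pivot_wider()`, which is a deviation from the answer: without it, any other column that varies by year (such as Size or Costs in the question) would act as an identifier and prevent the rows of one ID from collapsing into a single wide row. The price is that this sketch carries only ID, Year and Age through.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy panel: ID 2 has no row for 2015
toy <- tibble(
  ID   = c(1, 1, 1, 1, 2, 2, 2),
  Year = c("2013", "2014", "2015", "2016", "2013", "2014", "2016"),
  Age  = c(40, 41, 42, 43, 60, 61, 63)
)

# Select the year columns by name rather than by range, so the result
# does not depend on the order in which the years first appear
years <- as.character(2013:2016)

complete_long <- toy %>%
  # one row per ID, one column per year; missing years become NA
  pivot_wider(id_cols = ID, names_from = Year, values_from = Age) %>%
  # drop every ID that has an NA in any of the year columns
  drop_na(all_of(years)) %>%
  # back to one row per ID and year
  pivot_longer(all_of(years), names_to = "Year", values_to = "Age")

complete_long
```

Only the four rows of ID 1 should remain. Note that filtering on NA conflates "year missing" with "Age recorded as NA in that year", so this approach assumes the value column itself has no missing values.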

Answer 2

Just to lift @Jasonaizkains's answer out of the comment field above, since pivoting is not strictly necessary in this case. Here it is with some play data.

library(dplyr)
id <- rep(10:13, 4) # four subjects
year <- rep(2013:2016, each = 4) # four years
gender <- sample(1:2, 16, replace = TRUE)
play <- tibble(id, gender, year) # data.frame of 16

play <- play[-9,] # removes row for id 10 in 2015

# Keep only the ids that appear in all four years (drops every row of id 10)
play %>% group_by(id) %>% filter(n_distinct(year) >= 4) %>% ungroup()
#> # A tibble: 12 x 3
#>       id gender  year
#>    <int>  <int> <int>
#>  1    11      1  2013
#>  2    12      2  2013
#>  3    13      2  2013
#>  4    11      1  2014
#>  5    12      2  2014
#>  6    13      1  2014
#>  7    11      2  2015
#>  8    12      2  2015
#>  9    13      2  2015
#> 10    11      2  2016
#> 11    12      2  2016
#> 12    13      1  2016
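
Applied to the column names from the question (`ID`, and `Year` stored as character), the same idea might look like the sketch below. The small `df` here is a hypothetical stand-in for the real 909,587-row tibble, and `wanted_years` and `balanced` are names introduced only for illustration. Instead of counting distinct years, it checks explicitly that each of the four years is present, which stays correct even if the data frame later contains other years as well.

```r
library(dplyr)

# Hypothetical stand-in for the question's data: ID 16 lacks 2015
df <- tibble(
  ID    = c(12, 12, 12, 12, 16, 16, 16),
  Year  = c("2016", "2013", "2014", "2015", "2013", "2014", "2016"),
  Costs = c(2957.47, 0, 0.34, 1041.67, 0, 10, 20)
)

# Year is a character column in the question, so compare against strings
wanted_years <- as.character(2013:2016)

balanced <- df %>%
  group_by(ID) %>%
  # keep a group only if every wanted year occurs in it
  filter(all(wanted_years %in% Year)) %>%
  ungroup()

# Sanity check: every remaining ID should have exactly four rows
balanced %>% count(ID)
```

All other columns are kept untouched, because the rows are filtered in place rather than reshaped.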
