How to select the row whose row number equals the value of another column (with duplicates) in R?

Question

I have a data frame as follows:

df <- cbind(c(1,1,1,2,2,2,3,3,3,3), c(6,12,18,3,9,12,4,8,12,16),c(3,3,3,2,2,2,4,4,4,4))
colnames(df) <- c("ID","value","index")

I want to get the following result:

df1 <- cbind(c(1,2,3), c(18,9,16),c(3,2,4))

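So, for each ID, I basically want to extract the row whose row number within that ID equals the ID's index. For example, the 3rd row of ID 1, the 2nd row of ID 2, and the 4th row of ID 3.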

I tried the following code:

df1 <- df%>%group_by(ID)%>%filter(index==index)

But it does not work. Please help me solve this.

Answer 1

Use slice to select, for each ID, the row at the position given by index.

library(dplyr)
df %>% group_by(ID) %>% slice(first(index)) %>% ungroup

#     ID value index
#  <dbl> <dbl> <dbl>
#1     1    18     3
#2     2     9     2
#3     3    16     4
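
Since index is constant within each ID, first(index) is a single position and slice picks that row from the group. A minimal sketch of one edge case, using hypothetical data (not from the question) in which ID 2 asks for a row that does not exist: as far as I know, slice silently ignores positions beyond the group size, so that ID simply drops out of the result.

```r
library(dplyr)

# Hypothetical data: ID 2 has only 3 rows, but its index asks for row 5
df_edge <- data.frame(ID    = c(1, 1, 1, 2, 2, 2),
                      value = c(6, 12, 18, 3, 9, 12),
                      index = c(3, 3, 3, 5, 5, 5))

# first(index) is one number per group; slice() takes that row position
res_edge <- df_edge %>%
  group_by(ID) %>%
  slice(first(index)) %>%
  ungroup()

# ID 1 contributes its 3rd row; ID 2 contributes nothing
print(res_edge)
```

If a missing row should be treated as an error rather than dropped, comparing the number of result rows with the number of distinct IDs is one way to check.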

This can be written with data.table and base R as:

library(data.table)
setDT(df)[, .SD[first(index)], ID]

#Base R
subset(df, index == ave(value, ID, FUN = seq_along))
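
The base R one-liner can be unpacked into two steps, which may make the logic easier to follow. This is only a sketch with no packages loaded; the names row_in_group and result are introduced here for illustration.

```r
df <- data.frame(ID    = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                 value = c(6, 12, 18, 3, 9, 12, 4, 8, 12, 16),
                 index = c(3, 3, 3, 2, 2, 2, 4, 4, 4, 4))

# Step 1: number the rows 1, 2, 3, ... separately within each ID
row_in_group <- ave(seq_along(df$ID), df$ID, FUN = seq_along)

# Step 2: keep the rows whose within-ID position equals index
result <- df[row_in_group == df$index, ]
print(result)
```

The row names of result are the original row positions (3, 5 and 10), which can be reset with rownames(result) <- NULL if they are not wanted.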

Data

df <- data.frame(ID = c(1,1,1,2,2,2,3,3,3,3), 
                 value = c(6,12,18,3,9,12,4,8,12,16),
                 index = c(3,3,3,2,2,2,4,4,4,4))

Answer 2

Just to add to Ronak Shah's answer, I think one of the simplest ways to do what you want is the following:

library(dplyr)
df <- 
    data.frame(ID = c(1,1,1,2,2,2,3,3,3,3), value = c(6,12,18,3,9,12,4,8,12,16), index = c(3,3,3,2,2,2,4,4,4,4))

df %>% group_by(ID) %>% filter(row_number() == index) %>% ungroup
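
For comparison, here is a short sketch of why the attempt in the question returns every row: index == index compares the column with itself, so it is TRUE for every row, whereas row_number() gives the position of each row within its ID. The names attempt and fixed are used only for illustration, and the data is assumed to be a data.frame as in the answers above.

```r
library(dplyr)

df <- data.frame(ID    = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                 value = c(6, 12, 18, 3, 9, 12, 4, 8, 12, 16),
                 index = c(3, 3, 3, 2, 2, 2, 4, 4, 4, 4))

# The condition from the question: always TRUE, so nothing is filtered out
attempt <- df %>% group_by(ID) %>% filter(index == index) %>% ungroup()

# Comparing the within-group row number with index keeps one row per ID
fixed <- df %>% group_by(ID) %>% filter(row_number() == index) %>% ungroup()

nrow(attempt)  # all 10 rows are kept
nrow(fixed)    # 3 rows, one per ID
```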
