How can you get all the minimum values of a variable by group?
I have a data frame:
df <- data.frame(P = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
                 index = c("ind1", "ind2", "ind3", "ind1", "ind2", "ind3", "ind1", "ind2", "ind3"),
                 var = c(2, 1, 1, 8, 5, 4, 2, 8, 6))
For each value of P, I want to get all the minimum values of var together with the associated index.
I can do this:
library(data.table)
DT <- data.table(df)
DT[, .SD[which.min(var)], by = P]
But that gives only one of the minima of var (the first one) for each P:
P index var
1: A ind2 1
2: B ind3 4
3: C ind1 2
I would like:
P index var
1: A ind2 1
2: A ind3 1
3: B ind3 4
4: C ind1 2
Any ideas?
The help page for which.min says:
Determines the location, i.e., index of the (first) minimum or maximum of a numeric (or logical) vector.
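A quick way to see that behaviour is to run both forms on the var values of group A from the question. This is only a small base R sketch; the names x, first_min and all_min are made up for illustration.

```r
# var values for group A in the question's data
x <- c(2, 1, 1)

# which.min() stops at the first minimum
first_min <- which.min(x)

# comparing against min(x) keeps every position that ties for the minimum
all_min <- which(x == min(x))

print(first_min)  # [1] 2
print(all_min)    # [1] 2 3
```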
If you want every row whose value equals the minimum, compare with == instead. So, continuing with your approach, try:
DT[, .SD[var == min(var)], by = P]
## P index var
## 1: A ind2 1
## 2: A ind3 1
## 3: B ind3 4
## 4: C ind1 2
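A variant some data.table users prefer collects row numbers with .I and subsets the table once, instead of subsetting .SD inside every group. It may be faster on tables with many groups, but treat that as an assumption to benchmark on your own data. The sketch below rebuilds the question's table so it runs on its own; rows and res are illustrative names.

```r
library(data.table)

# Same data as in the question
DT <- data.table(
  P     = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  index = c("ind1", "ind2", "ind3", "ind1", "ind2", "ind3", "ind1", "ind2", "ind3"),
  var   = c(2, 1, 1, 8, 5, 4, 2, 8, 6)
)

# .I holds the row numbers of DT for the current group;
# keep the ones where var equals the group minimum
rows <- DT[, .I[var == min(var)], by = P]$V1

# Subset the original table once with the collected row numbers
res <- DT[rows]
print(res)
```

Here rows comes out as 2, 3, 6, 7, which are the same four rows returned by the .SD version above.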
With dplyr, you can use any of the following:
library(dplyr)
DT %>% group_by(P) %>% filter(var == min(var)) # or %in% instead of ==
#Source: local data table [4 x 3]
#Groups: P
#
# P index var
# (fctr) (fctr) (dbl)
#1 A ind2 1
#2 A ind3 1
#3 B ind3 4
#4 C ind1 2
or
DT %>% group_by(P) %>% top_n(1, desc(var)) # top_n() returns multiple rows in case of ties
#Source: local data table [4 x 3]
#Groups: P
#
# P index var
# (fctr) (fctr) (dbl)
#1 A ind2 1
#2 A ind3 1
#3 B ind3 4
#4 C ind1 2
or
DT %>% group_by(P) %>% filter(min_rank(var) == 1)
#Source: local data table [4 x 3]
#Groups: P
#
# P index var
# (fctr) (fctr) (dbl)
#1 A ind2 1
#2 A ind3 1
#3 B ind3 4
#4 C ind1 2
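In dplyr 1.0.0 and later, top_n() is superseded by slice_min() and slice_max(), so a more current spelling of the same idea might look like the sketch below. It assumes a recent dplyr is installed and works on a plain data frame rather than the data.table above; res is an illustrative name.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  P     = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  index = c("ind1", "ind2", "ind3", "ind1", "ind2", "ind3", "ind1", "ind2", "ind3"),
  var   = c(2, 1, 1, 8, 5, 4, 2, 8, 6)
)

# Keep the row(s) with the smallest var in each group.
# with_ties = TRUE is the default; it is spelled out here because
# keeping ties is the whole point of the question.
res <- df %>%
  group_by(P) %>%
  slice_min(var, n = 1, with_ties = TRUE) %>%
  ungroup()

print(res)
```

Setting with_ties = FALSE would return exactly one row per group, which is the which.min behaviour the question is trying to avoid.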