R: Find non-unique entries based on a variable
I'm having trouble wording this correctly, so I'd appreciate any edits.
I have a data frame that looks like this, showing the operators of various satellites:
Operator Satellite
United SAT1
American SAT2
United SAT1
United SAT3
American SAT3
American SAT5
Delta SAT1
United SAT8
I am trying to get a data frame that gives me the entries, listing both variables, for satellites that are operated by more than one operator:
Operator Satellite
United SAT1
United SAT1
Delta SAT1
United SAT3
American SAT3
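The only way I've been able to solve this is with a which(file$operator) loop, but that seems like an unnecessarily cumbersome approach.
I would appreciate any help, and I have no preference regarding packages. Thank you in advance.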
Here is a data.table one-liner:
library( data.table )
DT <- fread("Operator Satellite
United SAT1
American SAT2
United SAT1
United SAT3
American SAT3
American SAT5
Delta SAT1
United SAT8")
DT[, if( .N > 1 ) .SD, by = Satellite]
# Satellite Operator
# 1: SAT1 United
# 2: SAT1 United
# 3: SAT1 Delta
# 4: SAT3 United
# 5: SAT3 American
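To make it easier to see why the one-liner works, here is a rough sketch that splits it into two steps. The names `counts` and `multi` are mine, not from the answer, and the table is built directly with `data.table()` rather than `fread()` purely for convenience.

```r
library(data.table)

# Same sample data as above, built directly instead of parsed from text.
DT <- data.table(
  Operator  = c("United", "American", "United", "United",
                "American", "American", "Delta", "United"),
  Satellite = c("SAT1", "SAT2", "SAT1", "SAT3",
                "SAT3", "SAT5", "SAT1", "SAT8")
)

# Step 1: .N is the number of rows in each Satellite group.
counts <- DT[, .N, by = Satellite]

# Step 2: return the group's rows (.SD) only when the group has more than
# one row; otherwise the `if` yields NULL and the group is dropped.
multi <- DT[, if (.N > 1) .SD, by = Satellite]
```

Note that the grouping column comes first in the result, which is why Satellite appears before Operator in the output above.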
You can try a dplyr solution:
library(dplyr)
df <- structure(list(Operator = c("United", "American", "United", "United",
"American", "American", "Delta", "United"), Satellite = c("SAT1",
"SAT2", "SAT1", "SAT3", "SAT3", "SAT5", "SAT1", "SAT8")), class = "data.frame", row.names = c(NA,
-8L))
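# Count rows per satellite, join the counts back on, and keep satellites with N > 1
df %>% left_join(df %>% group_by(Satellite) %>% summarise(N = n()), by = "Satellite") %>% filter(N > 1)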
Operator Satellite N
1 United SAT1 3
2 United SAT1 3
3 United SAT3 2
4 American SAT3 2
5 Delta SAT1 3
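If the extra N column is not wanted, the join can probably be skipped by filtering inside the groups. This is a minimal sketch of that alternative, not the answerer's code; `multi` is just an illustrative name.

```r
library(dplyr)

df <- data.frame(
  Operator  = c("United", "American", "United", "United",
                "American", "American", "Delta", "United"),
  Satellite = c("SAT1", "SAT2", "SAT1", "SAT3",
                "SAT3", "SAT5", "SAT1", "SAT8")
)

# n() is evaluated per group, so this keeps every row whose Satellite
# occurs more than once, and the original row order is preserved.
multi <- df %>%
  group_by(Satellite) %>%
  filter(n() > 1) %>%
  ungroup()
```

The result has the same five rows as above, with only the Operator and Satellite columns.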
Here is a one-liner that needs no additional packages:
dw <- read.table(header = TRUE, text = '
Operator Satellite
United SAT1
American SAT2
United SAT1
United SAT3
American SAT3
American SAT5
Delta SAT1
United SAT8
')
# Count rows per satellite on an integer index so the comparison with 1 is numeric
dw[ave(seq_along(dw$Satellite), dw$Satellite, FUN = length) > 1, ]
Operator Satellite
1 United SAT1
3 United SAT1
4 United SAT3
5 American SAT3
7 Delta SAT1
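One caveat that applies to all three answers: they count rows per satellite, not distinct operators. For the sample data the two agree, but a satellite listed twice under the same operator would also be kept. If "more than one operator" is meant literally, something along these lines might be closer. It is a base R sketch under that assumption, and `n_ops`, `keep`, and `shared` are names I made up.

```r
dw <- data.frame(
  Operator  = c("United", "American", "United", "United",
                "American", "American", "Delta", "United"),
  Satellite = c("SAT1", "SAT2", "SAT1", "SAT3",
                "SAT3", "SAT5", "SAT1", "SAT8"),
  stringsAsFactors = FALSE
)

# Number of distinct operators per satellite, named by satellite.
n_ops <- tapply(dw$Operator, dw$Satellite, function(x) length(unique(x)))

# Look up each row's satellite in that table and keep rows whose
# satellite has at least two different operators.
keep <- unname(n_ops[dw$Satellite] > 1)
shared <- dw[keep, ]
```

With the sample data this returns the same rows 1, 3, 4, 5 and 7 as the one-liner above, since SAT1 has two distinct operators (United and Delta) and SAT3 has two as well.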