Filter on a value in a column and remove rows that don't meet a condition
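I am new to R and have the following problem. I have data on dogs (example of useful part of data); the relevant columns are age_month and rasnaam (breed).
I have to look up, for every breed, whether it is small, medium, large, etc. For small breeds, all rows where age_month is below 9 must be removed; for medium breeds, rows where age_month is below 13 must be removed; and for large breeds, rows where age_month < 24.
I have tried a few things, but they don't work.
I put all the dogs in a list (I also tried a vector), like this (shown here for small dogs only):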
small_dogs <- list("Affenpinscher", "Bichon frisé", "Bolognezer", "Chihuahua, langhaar",
"Dandie Dinmont Terrier", "Dwergkeeshond", "Japanse Spaniel",
"Griffon belge", "Griffon bruxellois", "Kleine Keeshond",
"Lhasa Apso", "Maltezer", "Mopshond", "Pekingees", "Petit Brabançon",
"Shih Tzu", "Tibetaanse Spaniel", "Volpino Italiano", "Yorkshire Terrier")
I tried this:
for (i in 1:nrow(brachquest2)){
ifelse((brachquest2$rasnaam %in% small_dogs), (brachquest2 <- brachquest2[!(brachquest2$age_month < 9), ]),
ifelse((brachquest2$rasnaam %in% medium_dogs)), (brachquest2 <- brachquest2[!(brachquest2$age_month < 13), ]),
(brachquest2 <- brachquest2[!(brachquest2$age_month < 24), ]))
}
But I get an unused argument error.
Then I tried case_when(), but I'm not familiar with this function, so I may have used it incorrectly:
brachquest2 <- case_when(
brachquest2$rasnaam %in% small_dogs ~ brachquest2[!(brachquest2$age_month < 11), ],
brachquest2$rasnaam %in% medium_dogs ~ brachquest2[!(brachquest2$age_month < 13), ]
)
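Then I get an error saying the length must be 66 or 1, not 18.
(The number of rows is 66.)
I hope my explanation is clear.
Does anyone have some useful tips, or perhaps a simpler approach? Thanks for your help!
Thanks in advance.
In response to neilfws, here is the data for age_month and rasnaam. I'm not sure whether this is the right way to share it: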
structure(list(age_month = structure(c(50, 52, 52.1, 49.7, 49.7,
49.6, 49.6, 49.6, 49.5, 50, 48.8, 52.1, 51.9, 48.7, 50, 50.2,
50.4, 50.5, 49, 49, 49, 49, 49, 48.9, 15, 17.6, 17.6, 17.6, 17.6,
16.3, 17.6, 17.6, 15, 15.8, 16, 16.2, 17.5, 14.9, 10.4, 10.2,
10.5, 10.4, 10.3, 10.3, 10.2, 10.3, 10.3, 10.3, 12.8, 12.8, 12.8,
12.8, 12.8, 10, 10.4, 10.2, 10.3, 10.3, 12.7, 12.7, 13.2, 13.2,
13.1, 13.1, 12.7, 12.7), units = "days", class = "difftime"),
rasnaam = c("American Staffordshire Terrier", "Boxer", "Bull Terrier",
"Chihuahua, langhaar", "Chihuahua, langhaar", "Chihuahua, langhaar",
"Chihuahua, langhaar", "Chihuahua, langhaar", "Chihuahua, langhaar",
"Chihuahua, langhaar", "Franse Bulldog", "Franse Bulldog",
"Labrador Retriever", "Shih Tzu", "American Staffordshire Terrier",
"American Staffordshire Terrier", "American Staffordshire Terrier",
"American Staffordshire Terrier", "American Staffordshire Terrier",
"American Staffordshire Terrier", "American Staffordshire Terrier",
"American Staffordshire Terrier", "American Staffordshire Terrier",
"American Staffordshire Terrier", "American Staffordshire Terrier",
"Boxer", "Boxer", "Boxer", "Boxer", "Boxer", "Bull Terrier",
"Bull Terrier", "Chihuahua, langhaar", "Chihuahua, langhaar",
"Chihuahua, langhaar", "Chihuahua, langhaar", "Chihuahua, langhaar",
"Franse Bulldog", "Franse Bulldog", "Franse Bulldog", "Franse Bulldog",
"Franse Bulldog", "Labrador Retriever", "Labrador Retriever",
"Labrador Retriever", "Labrador Retriever", "Labrador Retriever",
"Labrador Retriever", "Labrador Retriever", "Labrador Retriever",
"Labrador Retriever", "Labrador Retriever", "Labrador Retriever",
"Shih Tzu", "Shih Tzu", "Shih Tzu", "Shih Tzu", "Shih Tzu",
"American Staffordshire Terrier", "Boxer", "Franse Bulldog",
"Franse Bulldog", "Shih Tzu", "Shih Tzu", "American Staffordshire Terrier",
"Boxer")), row.names = c(NA, -66L), class = "data.frame")
If you want to stick with case_when, here is one way to achieve what you're looking for:
library(dplyr)
brachquest2 %>%
  mutate(
    # Create a temp var, removal_status, to label what rows should be kept or removed
    removal_status = case_when(
      (rasnaam %in% small_dogs) & age_month < 9 ~ "Remove",
      (rasnaam %in% medium_dogs) & age_month < 13 ~ "Remove",
      (rasnaam %in% large_dogs) & age_month < 24 ~ "Remove",
      TRUE ~ "Keep"
    )) %>%
  # Keep only what's labelled "Keep"
  filter(removal_status == "Keep") %>%
  # Remove temp var
  select(-removal_status)
Using the small_dogs list you provided and my own medium_dogs list containing a single value, "Boxer", I get the following (the 2 Boxers with age_month below 13 are removed):
# age_month rasnaam
# 1 50.0 days American Staffordshire Terrier
# 2 52.0 days Boxer
# 3 52.1 days Bull Terrier
# 4 49.7 days Chihuahua, langhaar
# 5 49.7 days Chihuahua, langhaar
# 6 49.6 days Chihuahua, langhaar
# 7 49.6 days Chihuahua, langhaar
# 8 49.6 days Chihuahua, langhaar
# 9 49.5 days Chihuahua, langhaar
# 10 50.0 days Chihuahua, langhaar
# 11 48.8 days Franse Bulldog
# 12 52.1 days Franse Bulldog
# 13 51.9 days Labrador Retriever
# 14 48.7 days Shih Tzu
# 15 50.0 days American Staffordshire Terrier
# 16 50.2 days American Staffordshire Terrier
# 17 50.4 days American Staffordshire Terrier
# 18 50.5 days American Staffordshire Terrier
# 19 49.0 days American Staffordshire Terrier
# 20 49.0 days American Staffordshire Terrier
# 21 49.0 days American Staffordshire Terrier
# 22 49.0 days American Staffordshire Terrier
# 23 49.0 days American Staffordshire Terrier
# 24 48.9 days American Staffordshire Terrier
# 25 15.0 days American Staffordshire Terrier
# 26 17.6 days Boxer
# 27 17.6 days Boxer
# 28 17.6 days Boxer
# 29 17.6 days Boxer
# 30 16.3 days Boxer
# 31 17.6 days Bull Terrier
# 32 17.6 days Bull Terrier
# 33 15.0 days Chihuahua, langhaar
# 34 15.8 days Chihuahua, langhaar
# 35 16.0 days Chihuahua, langhaar
# 36 16.2 days Chihuahua, langhaar
# 37 17.5 days Chihuahua, langhaar
# 38 14.9 days Franse Bulldog
# 39 10.4 days Franse Bulldog
# 40 10.2 days Franse Bulldog
# 41 10.5 days Franse Bulldog
# 42 10.4 days Franse Bulldog
# 43 10.3 days Labrador Retriever
# 44 10.3 days Labrador Retriever
# 45 10.2 days Labrador Retriever
# 46 10.3 days Labrador Retriever
# 47 10.3 days Labrador Retriever
# 48 10.3 days Labrador Retriever
# 49 12.8 days Labrador Retriever
# 50 12.8 days Labrador Retriever
# 51 12.8 days Labrador Retriever
# 52 12.8 days Labrador Retriever
# 53 12.8 days Labrador Retriever
# 54 10.0 days Shih Tzu
# 55 10.4 days Shih Tzu
# 56 10.2 days Shih Tzu
# 57 10.3 days Shih Tzu
# 58 10.3 days Shih Tzu
# 59 12.7 days American Staffordshire Terrier
# 60 13.2 days Franse Bulldog
# 61 13.2 days Franse Bulldog
# 62 13.1 days Shih Tzu
# 63 13.1 days Shih Tzu
# 64 12.7 days American Staffordshire Terrier
Adjust the lists and the age_month conditions as needed.
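As a possible variation on the same idea, case_when can return the minimum age for each row instead of a "Keep"/"Remove" label, so the filter becomes a single comparison. This is only a sketch under assumptions: the `dogs` data frame, its plain numeric ages, the shortened `small_dogs` vector, and the contents of `medium_dogs` and `large_dogs` are all made up for illustration and are not the real breed lists.

```r
library(dplyr)

# Hypothetical size groups: small_dogs is a shortened version of the list in the
# question; medium_dogs and large_dogs are placeholders for illustration only.
small_dogs  <- c("Chihuahua, langhaar", "Shih Tzu")
medium_dogs <- c("Boxer")
large_dogs  <- c("Labrador Retriever")

# Toy data shaped like brachquest2 (assumption: ages stored as plain numbers)
dogs <- data.frame(
  age_month = c(8, 10, 12.7, 17.6, 10.3, 30, 5),
  rasnaam   = c("Shih Tzu", "Shih Tzu", "Boxer", "Boxer",
                "Labrador Retriever", "Labrador Retriever", "Bull Terrier")
)

kept <- dogs %>%
  # Minimum allowed age per row; breeds in no group get -Inf and are never dropped
  mutate(min_age = case_when(
    rasnaam %in% small_dogs  ~ 9,
    rasnaam %in% medium_dogs ~ 13,
    rasnaam %in% large_dogs  ~ 24,
    TRUE                     ~ -Inf
  )) %>%
  # Keep rows that are at least the minimum age for their size group
  filter(age_month >= min_age) %>%
  # Drop the helper column
  select(-min_age)

print(kept)
```

Because every branch of case_when returns a single number per row, the result always has the same length as the data frame, which avoids the "length must be 66 or 1" error from the question. In the real data age_month is a difftime column, so wrapping it in as.numeric() before comparing may make the intent clearer.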