Give an ID based on the total number of unique appearances in each group using dplyr
I have been struggling with this problem and would appreciate any guidance and help.
I have a data.frame that looks like this:
col1 <- c("a","a","b", "a","b","c","a","c","d")
replicate <- c("rep1","rep1","rep1","rep2","rep2","rep2","rep3","rep3","rep3")
df = data.frame(col1, replicate)
  col1 replicate
1    a      rep1
2    a      rep1
3    b      rep1
4    a      rep2
5    b      rep2
6    c      rep2
7    a      rep3
8    c      rep3
9    d      rep3
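I want to create another column containing the number of replicates in which each element of col1 appears,
without counting duplicates within a replicate. I want my data to look like this: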
  col1 replicate ID
1    a      rep1  3
2    a      rep1  3
3    b      rep1  2
4    a      rep2  3
5    b      rep2  2
6    c      rep2  2
7    a      rep3  3
8    c      rep3  2
9    d      rep3  1
This is because "a" appears in all 3 replicates,
"b" is present in rep1 and rep2,
"c" is present in rep2 and rep3,
and "d" appears only in rep3.
library(dplyr)
# Within each col1 group, count the distinct (col1, replicate) pairs
df %>% group_by(col1) %>%
  mutate(ID = n_distinct(col1, replicate))
# A tibble: 9 x 3
# Groups:   col1 [4]
  col1  replicate    ID
  <chr> <chr>     <int>
1 a     rep1          3
2 a     rep1          3
3 b     rep1          2
4 a     rep2          3
5 b     rep2          2
6 c     rep2          2
7 a     rep3          3
8 c     rep3          2
9 d     rep3          1
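Since col1 is constant within each group, counting distinct values of replicate alone should give the same result. Below is a minimal self-contained sketch of that variant; the name `res` and the trailing `ungroup()` are my additions, not part of the answer above.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  col1 = c("a", "a", "b", "a", "b", "c", "a", "c", "d"),
  replicate = c("rep1", "rep1", "rep1", "rep2", "rep2", "rep2", "rep3", "rep3", "rep3")
)

res <- df %>%
  group_by(col1) %>%
  # col1 is fixed inside each group, so distinct replicates are enough
  mutate(ID = n_distinct(replicate)) %>%
  # drop the grouping so later operations are not applied per group
  ungroup()

res
```

Calling `ungroup()` is optional here; it only matters if more dplyr verbs follow.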
Using uniqueN from data.table:
library(data.table)
setDT(df)[, ID := uniqueN(paste(col1, replicate)), col1]
-output
df
   col1 replicate ID
1:    a      rep1  3
2:    a      rep1  3
3:    b      rep1  2
4:    a      rep2  3
5:    b      rep2  2
6:    c      rep2  2
7:    a      rep3  3
8:    c      rep3  2
9:    d      rep3  1
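For comparison, the same count can be sketched in base R without extra packages. This is not one of the approaches posted above; the helper name `counts` is only illustrative.

```r
# Same example data as in the question
df <- data.frame(
  col1 = c("a", "a", "b", "a", "b", "c", "a", "c", "d"),
  replicate = c("rep1", "rep1", "rep1", "rep2", "rep2", "rep2", "rep3", "rep3", "rep3")
)

# For each col1 value, count the distinct replicates it occurs in
counts <- tapply(df$replicate, df$col1, function(x) length(unique(x)))

# Look the per-value counts up by name to get one ID per row
df$ID <- as.integer(counts[as.character(df$col1)])

df
```

The lookup by name relies on `tapply` returning a vector named by the col1 values.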