How to subtract values by group (subtract blank stored as one group) using dplyr?

I have some tidy data in which one of the groups is the blank:

df <- data.frame(Group = c(rep(LETTERS[1:3], 3), "Blank", "Blank", "Blank"), 
                 ID = rep(1:3, 4),
                 Value = c(10, 11, 12, 21, 22, 23, 31, 32, 33, 1, 2, 3))    
df
   Group ID Value
1      A  1    10
2      B  2    11
3      C  3    12
4      A  1    21
5      B  2    22
6      C  3    23
7      A  1    31
8      B  2    32
9      C  3    33
10 Blank  1     1
11 Blank  2     2
12 Blank  3     3

I would like to subtract the Blank from each group (A, B, C), matching on ID, so that the normalized data look like this:

df_normalized <- data.frame(Group = rep(LETTERS[1:3], 3),
                            ID = rep(1:3, 3),
                            Value = c(9, 9, 9, 20, 20, 20, 30, 30, 30))

df_normalized
  Group ID Value
1     A  1     9
2     B  2     9
3     C  3     9
4     A  1    20
5     B  2    20
6     C  3    20
7     A  1    30
8     B  2    30
9     C  3    30

How can I do this nicely with dplyr?

Edit: How can I do this when there are multiple clusters of groups? For example:

df <- data.frame(Cluster = c(rep("C1", 12), rep("C2", 12)),
                 Group = rep(c(rep(LETTERS[1:3], 3), "Blank", "Blank", "Blank"), 2), 
                 ID = rep(1:3, 8),
                 Value = sample(24))

Assuming there is exactly one "Blank" value per ID, as in the example, you can group by ID and subtract that group's "Blank" value:

library(dplyr)

df %>%
  group_by(ID) %>%
  mutate(Value = Value - Value[Group == "Blank"])  %>%
  filter(Group != "Blank")

#  Group    ID Value
#  <fct> <int> <dbl>
#1 A         1     9
#2 B         2     9
#3 C         3     9
#4 A         1    20
#5 B         2    20
#6 C         3    20
#7 A         1    30
#8 B         2    30
#9 C         3    30
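
Because the approach above relies on each ID having exactly one "Blank" row, it may be worth checking that assumption first. The following is only a minimal sketch of such a check; the names blank_counts and n_blank are made up for illustration and are not part of the original question or answer.

```r
library(dplyr)

# Example data from the question
df <- data.frame(Group = c(rep(LETTERS[1:3], 3), "Blank", "Blank", "Blank"),
                 ID = rep(1:3, 4),
                 Value = c(10, 11, 12, 21, 22, 23, 31, 32, 33, 1, 2, 3))

# Count the "Blank" rows within each ID (blank_counts and n_blank are hypothetical names)
blank_counts <- df %>%
  group_by(ID) %>%
  summarise(n_blank = sum(Group == "Blank"))

# TRUE only if every ID has exactly one "Blank" row
all(blank_counts$n_blank == 1)
```

If the check fails for your own data, the subtraction above is not well defined for the affected IDs, and you would need to decide which "Blank" value to use.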

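If an ID has more than one "Blank" row, you can use match instead. It returns the position of the first match, so only the first "Blank" value is used.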

df %>%
  group_by(ID) %>%
  mutate(Value = Value - Value[match("Blank", Group)])  %>%
  filter(Group != "Blank")
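
For the multi-cluster case in the question's edit, one possible extension is to add Cluster to the grouping, so that each value is compared with the "Blank" of its own Cluster and ID. This is a sketch under the assumption that every Cluster/ID combination contains at least one "Blank" row; the set.seed call is added here only to make the random Value column reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: added only so that sample() is reproducible
df <- data.frame(Cluster = c(rep("C1", 12), rep("C2", 12)),
                 Group = rep(c(rep(LETTERS[1:3], 3), "Blank", "Blank", "Blank"), 2),
                 ID = rep(1:3, 8),
                 Value = sample(24))

df_normalized <- df %>%
  group_by(Cluster, ID) %>%                                # one "Blank" per Cluster and ID
  mutate(Value = Value - Value[match("Blank", Group)]) %>% # subtract that group's first "Blank"
  ungroup() %>%
  filter(Group != "Blank")                                 # drop the "Blank" rows themselves

df_normalized
```

If a Cluster/ID combination has no "Blank" row, match returns NA and the normalized values for that combination come out as NA, which at least makes the gap visible.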