Conditional statement in dplyr/tidyverse function to exclude comparisons among same levels of a factor
I have a data frame like this:
data = read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)
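I want to compare each level of plot to get the count of unique species matches between each pair of plots. However, I do not want comparisons between identical plots (i.e., remove/do not include 1A_1A, 1B_1B, 2C_2C, etc.). The output for this example should look like this: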
output<-
region plot freq
1 1A_1B 0
1 1A_1C 1
1 1A_1D 0
1 1B_1C 0
1 1B_1D 0
1 1C_1D 0
2 2A_2B 2
2 2A_2C 1
2 2A_2D 1
2 2B_2C 1
2 2B_2D 1
2 2C_2D 0
3 3A_3B 1
I adapted the following code from @HubertL, but I am struggling to incorporate an appropriate if-else statement to meet this condition:
library(tidyverse)
data %>% group_by(region, species) %>%
filter(n() > 1) %>%
summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>%
unnest %>%
group_by(region, y) %>%
summarize(ifelse(plot[i] = plot[i], freq =
length(unique((species),)
You can filter out the duplicates by adding filter(!duplicated(plot)):
data %>% group_by(region, species) %>%
filter(!duplicated(plot)) %>%
filter(n() > 1) %>%
summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>%
unnest(y) %>%
group_by(region, y) %>%
summarize(freq=n())
region y freq
<int> <chr> <int>
1 1 1A_1C 1
2 2 2A_2B 2
3 2 2A_2C 1
4 2 2A_2D 1
5 2 2B_2C 1
6 2 2B_2D 1
7 3 3A_3B 1
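Note that the result above contains only the pairs that share at least one species, while the desired output in the question also lists the pairs with a freq of 0. One possible way to keep those rows is to enumerate every pair of distinct plots within a region and count the species they have in common. The following is a minimal sketch, not necessarily how the answerer would do it; count_shared is a hypothetical helper name, and it assumes every region has at least two plots (combn fails otherwise).

```r
library(dplyr)

data <- read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)

# Hypothetical helper: for one region, build every pair of distinct plots
# and count the unique species the two plots share.
count_shared <- function(df) {
  plots <- sort(unique(df$plot))
  species_by_plot <- split(df$species, df$plot)  # species found in each plot
  pairs <- combn(plots, 2)                       # 2-row matrix, one pair per column
  data.frame(
    plot = paste(pairs[1, ], pairs[2, ], sep = "_"),
    # intersect() drops duplicates, so repeated rows are counted once
    freq = apply(pairs, 2, function(p) {
      length(intersect(species_by_plot[[p[1]]], species_by_plot[[p[2]]]))
    }),
    stringsAsFactors = FALSE
  )
}

result <- data %>%
  group_by(region) %>%
  group_modify(~ count_shared(.x))

print(result, n = Inf)
```

Because combn only ever pairs a plot with a different plot, same-plot comparisons such as 1A_1A never appear, so no if-else is needed. For the example data this should give the 13 rows listed in the question, including the zero counts.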