Apply function per row in R
I have a dataset that I want to apply shapiro.test to.
library(tibble)
# 5 genes x 2 treatments, 5 replicates per group, 50 rows in total
test_data <- tibble(
gene = rep(c(LETTERS[1:5]), times = 2, each = 5),
treatment = c(rep('control', times = 25) , rep('treatment', times = 25)),
day = rep(c(1:2), times = 5, each = 5),
data = rnorm(50, mean = 25, sd = 5)
)
# A tibble: 50 x 4
gene treatment day data
<chr> <chr> <int> <dbl>
1 A control 1 28.8
2 A control 1 22.4
3 A control 1 24.8
4 A control 1 20.1
5 A control 1 15.6
6 B control 2 26.5
7 B control 2 26.2
8 B control 2 25.3
9 B control 2 21.4
10 B control 2 35.0
# … with 40 more rows
I created a function to run the test for a given gene, treatment, and day:
normality_test <- function(x, y, z){
with(test_data, shapiro.test(data[gene == x & treatment == y & day == z]))
}
So if I run normality_test('A', 'control', '1'), it tests gene A in the control group on day 1.
Shapiro-Wilk normality test
data: data[gene == x & treatment == y & day == z]
W = 0.99935, p-value = 0.9998
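However, I would like the function to loop through every combination of gene/treatment/day and output each normality test separately, and I have not been able to figure out how.
I have managed to write a loop that outputs each row as a separate tibble, but I could not split a row into its individual elements to pass them to the normality_test function.
I also tried map and lmap, without success.
Thanks.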
library(dplyr)
test_data %>%
nest_by(gene, treatment, day, .key = "nested") %>%
mutate(sw = list(shapiro.test(nested$data))) %>%
pull(sw)
Output
Your sample data contains 10 unique combinations of gene, treatment, and day.
The first few elements:
[[1]]
Shapiro-Wilk normality test
data: nested$data
W = 0.88041, p-value = 0.3112
[[2]]
Shapiro-Wilk normality test
data: nested$data
W = 0.96533, p-value = 0.8445
If you do not pipe to pull, the output is:
gene treatment day nested sw
<chr> <chr> <int> <list<tibble[,1]>> <list>
1 A control 1 [5 x 1] <htest>
2 A treatment 2 [5 x 1] <htest>
3 B control 2 [5 x 1] <htest>
4 B treatment 1 [5 x 1] <htest>
5 C control 1 [5 x 1] <htest>
6 C treatment 2 [5 x 1] <htest>
7 D control 2 [5 x 1] <htest>
8 D treatment 1 [5 x 1] <htest>
9 E control 1 [5 x 1] <htest>
10 E treatment 2 [5 x 1] <htest>
The sw column contains the test results.
Or, for just the p-values:
test_data %>%
nest_by(gene, treatment, day, .key = "nested") %>%
mutate(sw = shapiro.test(nested$data)$p.value)
Output
gene treatment day nested sw
<chr> <chr> <int> <list<tibble[,1]>> <dbl>
1 A control 1 [5 x 1] 0.311
2 A treatment 2 [5 x 1] 0.845
3 B control 2 [5 x 1] 0.408
4 B treatment 1 [5 x 1] 0.204
5 C control 1 [5 x 1] 0.435
6 C treatment 2 [5 x 1] 0.316
7 D control 2 [5 x 1] 0.143
8 D treatment 1 [5 x 1] 0.236
9 E control 1 [5 x 1] 0.695
10 E treatment 2 [5 x 1] 0.658
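If the nested column is not needed, the same per-group p-values can probably be obtained with a plain group_by() and summarise(). This is a minimal sketch rather than the answer's own method: the set.seed() call and the names sw_summary, W, and p_value are assumptions added for illustration, and the data is rebuilt so the block runs on its own.

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only to make the sketch reproducible

# Same layout as the question's data: 10 groups of 5 observations each
test_data <- tibble(
  gene = rep(LETTERS[1:5], times = 2, each = 5),
  treatment = c(rep("control", times = 25), rep("treatment", times = 25)),
  day = rep(1:2, times = 5, each = 5),
  data = rnorm(50, mean = 25, sd = 5)
)

# One row per gene/treatment/day combination, with the test statistic
# and the p-value as ordinary numeric columns
sw_summary <- test_data %>%
  group_by(gene, treatment, day) %>%
  summarise(
    W = unname(shapiro.test(data)$statistic),
    p_value = shapiro.test(data)$p.value,
    .groups = "drop"
  )

print(sw_summary)
```

The trade-off is that shapiro.test() is called twice per group here. For groups this small that cost is negligible, and the result is a flat tibble that is easy to filter or join.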
aggregate(test_data$data,
list(test_data$gene, test_data$treatment, test_data$day),
FUN = shapiro.test,
simplify = FALSE)
Result:
Group.1 Group.2 Group.3 x
1 A control 1 0.88967912801297, 0.355492282929607, Shapiro-Wilk normality test, X[[i]]
2 C control 1 0.872622601121686, 0.277206156703845, Shapiro-Wilk normality test, X[[i]]
3 E control 1 0.886337216320072, 0.339020498506711, Shapiro-Wilk normality test, X[[i]]
4 B treatment 1 0.902918723585913, 0.426227078990388, Shapiro-Wilk normality test, X[[i]]
5 D treatment 1 0.850079117181635, 0.194768724952993, Shapiro-Wilk normality test, X[[i]]
6 B control 2 0.810238329506977, 0.0979423168617567, Shapiro-Wilk normality test, X[[i]]
7 D control 2 0.965126339172019, 0.843147936412715, Shapiro-Wilk normality test, X[[i]]
8 A treatment 2 0.933316692276928, 0.619155973369246, Shapiro-Wilk normality test, X[[i]]
9 C treatment 2 0.771672137756979, 0.0466697848359151, Shapiro-Wilk normality test, X[[i]]
10 E treatment 2 0.91332590966644, 0.487832813430558, Shapiro-Wilk normality test, X[[i]]
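The x column above holds whole htest objects, which are awkward to read in a printed data frame. If only the p-values are wanted, one option is the formula interface of aggregate() with a small wrapper function. The sketch below assumes a plain data.frame with the same columns as the question's tibble; the set.seed() call and the names p_only and p_values are illustrative assumptions.

```r
set.seed(1)  # hypothetical seed, only to make the sketch reproducible

# Base-R copy of the question's data layout (a data.frame instead of a tibble)
test_data <- data.frame(
  gene = rep(LETTERS[1:5], times = 2, each = 5),
  treatment = c(rep("control", times = 25), rep("treatment", times = 25)),
  day = rep(1:2, times = 5, each = 5),
  data = rnorm(50, mean = 25, sd = 5)
)

# Wrapper that returns only the p-value of the Shapiro-Wilk test
p_only <- function(v) shapiro.test(v)$p.value

# One row per gene/treatment/day combination; the `data` column
# now holds the p-value for that group
p_values <- aggregate(data ~ gene + treatment + day,
                      data = test_data,
                      FUN = p_only)

print(p_values)
```

Because the wrapper returns a single number per group, the result is an ordinary numeric column and the grouping columns keep their original names instead of Group.1, Group.2, and Group.3.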