Ordering a factor to make a clustered ggplot heatmap but getting an odd error
I'm trying to make a clustered heatmap as described in "Cluster data in heat map in R ggplot", and I've run into a puzzling error.
I can make a non-clustered distance heatmap like this:
library(vegan)
library(tidyverse)
data(varespec)
library(reshape2)
library(viridis)
# Calculate a distance matrix
vare.dist <- vegdist(varespec)
# Cluster the distance matrix.
vare.hc <- hclust(as.dist(vare.dist))
# Process and melt the distance matrix
vare.dist.long <- vare.dist %>% as.matrix %>% melt %>%
  mutate(Var1 = as.character(Var1), Var2 = as.character(Var2))
# Plot the heatmap
vare.dist.long %>% #as.matrix %>% .[vare.hc$order, vare.hc$order] %>% melt %>%
  ggplot(aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_viridis(direction = 1) +
  theme(axis.text.x = element_text(angle = 270, hjust = 0, vjust = 0.5))
To cluster the heatmap, I need to convert vare.dist.long$Var1 and vare.dist.long$Var2 into correctly ordered factors. I thought I could do that like this:
# Step 1: works without complaint
vare.dist.long1 <- vare.dist.long %>% mutate(Var1 = factor(Var1, levels = Var1[vare.hc$order]))
# Step 2: throws error
vare.dist.long2 <- vare.dist.long %>% mutate(Var2 = factor(Var2, levels = Var2[vare.hc$order]))
and then replace vare.dist.long with vare.dist.long3 in the plotting function.
Oddly, when I try to convert Var2 (as in the # Step 2 line), I get the following error:
Error in mutate_impl(.data, dots): Evaluation error: factor level [2] is duplicated.
Traceback:
1. vare.dist.long %>% mutate(Var2 = factor(Var2, levels = Var2[vare.hc$order]))
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. mutate(., Var2 = factor(Var2, levels = Var2[vare.hc$order]))
10. mutate.data.frame(., Var2 = factor(Var2, levels = Var2[vare.hc$order]))
11. as.data.frame(mutate(tbl_df(.data), ...))
12. mutate(tbl_df(.data), ...)
13. mutate.tbl_df(tbl_df(.data), ...)
14. mutate_impl(.data, dots)
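What am I missing here? Why can't I mutate Var2, which as far as I can tell is nearly identical to Var1 but in a different order?
The vector supplied to the levels argument must not contain any duplicates. If you type the following in the console, you will see that every level you supplied for Var2 is the same value: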
vare.dist.long$Var2[vare.hc$order]
# [1] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
# [19] "18" "18" "18" "18" "18" "18"
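The reason Step 1 succeeds while Step 2 fails can be sketched on a toy matrix. This is a minimal base R illustration, assuming the long table is laid out with Var1 varying fastest, which is consistent with the output above; the matrix `m`, its labels, and the index vector `ord` (a stand-in for vare.hc$order) are hypothetical.

```r
# Toy 3 x 3 symmetric matrix with hypothetical labels
m <- matrix(c(0, 1, 2,
              1, 0, 3,
              2, 3, 0),
            nrow = 3,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Long form with Var1 varying fastest (assumed to match what melt()
# produces for a matrix, as the "18" output above suggests)
long <- data.frame(
  Var1 = rownames(m)[as.vector(row(m))],
  Var2 = colnames(m)[as.vector(col(m))],
  value = as.vector(m),
  stringsAsFactors = FALSE
)

ord <- c(3, 1, 2)  # hypothetical stand-in for vare.hc$order

long$Var1[ord]          # "c" "a" "b": the first rows hold distinct labels
long$Var2[ord]          # "a" "a" "a": the first rows all share one column label
unique(long$Var2)[ord]  # "c" "a" "b": distinct labels again
```

Indexing the raw column only ever touches its first few rows, so for Var1 it happens to return distinct labels, while for Var2 it returns the same label repeatedly, which factor() rejects as duplicated levels.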
I think the following would work. unique(Var1) and unique(Var2) ensure that there are no duplicates.
vare.dist.long1 <- vare.dist.long %>% mutate(Var1 = factor(Var1, levels = unique(Var1)[vare.hc$order]))
vare.dist.long2 <- vare.dist.long %>% mutate(Var2 = factor(Var2, levels = unique(Var2)[vare.hc$order]))
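Putting the pieces together, a clustered heatmap could be sketched as follows. This is one possible variant rather than the only approach: it takes the level order from vare.hc$labels[vare.hc$order] (the labels that hclust stores from the distance object) instead of unique(), and it converts both columns in a single mutate call. The names `clustered.levels`, `vare.dist.clustered`, and `p` are introduced here for illustration.

```r
library(vegan)
library(reshape2)
library(dplyr)
library(ggplot2)
library(viridis)

data(varespec)

# Distance matrix and hierarchical clustering, as in the question
vare.dist <- vegdist(varespec)
vare.hc <- hclust(vare.dist)

# Site labels in dendrogram order; each label appears exactly once
clustered.levels <- vare.hc$labels[vare.hc$order]

# Melt the distance matrix and order both axes by the clustering
vare.dist.clustered <- vare.dist %>%
  as.matrix() %>%
  melt() %>%
  mutate(
    Var1 = factor(as.character(Var1), levels = clustered.levels),
    Var2 = factor(as.character(Var2), levels = clustered.levels)
  )

# Same plot as before, now drawn from the ordered factors
p <- ggplot(vare.dist.clustered, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_viridis(direction = 1) +
  theme(axis.text.x = element_text(angle = 270, hjust = 0, vjust = 0.5))
print(p)
```

Because both axes share one level vector, the rows and columns of the heatmap stay in the same order, which keeps the diagonal intact.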