Ordering a factor to make a clustered ggplot heatmap but getting an odd error

I am trying to make a clustered heatmap as described in "Cluster data in heat map in R ggplot" and have run into a puzzling error.

I can make a non-clustered distance heatmap as follows:

library(vegan)
library(tidyverse)
data(varespec)
library(reshape2)
library(viridis)

# Calculate a distance matrix
vare.dist <- vegdist(varespec)

# Cluster the distance matrix.
vare.hc <- hclust(as.dist(vare.dist))

# Process and melt the distance matrix
vare.dist.long <- vare.dist %>% as.matrix %>% melt %>%
  mutate(Var1 = as.character(Var1), Var2 = as.character(Var2))

# Plot the heatmap
vare.dist.long %>% #as.matrix %>% .[vare.hc$order, vare.hc$order] %>% melt %>%
  ggplot(aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_viridis(direction = 1) +
  theme(axis.text.x = element_text(angle = 270, hjust = 0, vjust = 0.5))

To cluster the heatmap, I need to convert vare.dist.long$Var1 and vare.dist.long$Var2 into correctly ordered factors. I thought I could do this with:

# Step 1: works without complaint
vare.dist.long1 <- vare.dist.long %>% mutate(Var1 = factor(Var1, levels = Var1[vare.hc$order]))
# Step 2: throws error
vare.dist.long2 <- vare.dist.long %>% mutate(Var2 = factor(Var2, levels = Var2[vare.hc$order]))

and then substitute vare.dist.long3 for vare.dist.long in the plotting call.

Strangely, when I try to reorder Var2 (as in the # Step 2 line), I get the following error:

Error in mutate_impl(.data, dots): Evaluation error: factor level [2] is duplicated.
Traceback:

1. vare.dist.long %>% mutate(Var2 = factor(Var2, levels = Var2[vare.hc$order]))
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. mutate(., Var2 = factor(Var2, levels = Var2[vare.hc$order]))
10. mutate.data.frame(., Var2 = factor(Var2, levels = Var2[vare.hc$order]))
11. as.data.frame(mutate(tbl_df(.data), ...))
12. mutate(tbl_df(.data), ...)
13. mutate.tbl_df(tbl_df(.data), ...)
14. mutate_impl(.data, dots)

What am I missing here? Why can't I mutate Var2, which as far as I can tell is nearly identical to Var1, just in a different order?

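The vector supplied to the levels argument must not contain any duplicates. If you type the following in the console, you will see that you supplied the same level for every entry of Var2:
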
vare.dist.long$Var2[vare.hc$order]
# [1] "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18" "18"
# [19] "18" "18" "18" "18" "18" "18"
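
This also suggests why Step 1 ran without complaint: in the melted table Var1 cycles through every site within the first block of rows, so its first 24 entries happen to be distinct, while Var2 stays constant over that block. A minimal base R sketch reproduces the effect on a hypothetical three-site table. The names `long`, `ord`, and the sites "a", "b", "c" are made up for illustration, and `expand.grid` only stands in for the row layout that `melt` produces.

```r
# Toy stand-in for the melted distance matrix: Var1 varies fastest.
sites <- c("a", "b", "c")
long <- expand.grid(Var1 = sites, Var2 = sites, stringsAsFactors = FALSE)

# Hypothetical clustering order, playing the role of vare.hc$order.
ord <- c(2, 3, 1)

# Indexing the long columns with ord only touches the first three rows.
lev1 <- long$Var1[ord]   # three distinct names, so factor() accepts them
lev2 <- long$Var2[ord]   # the same name repeated three times

f1 <- factor(long$Var1, levels = lev1)

# factor() rejects duplicated levels; capture the message instead of stopping.
res <- tryCatch(factor(long$Var2, levels = lev2),
                error = function(e) conditionMessage(e))

print(lev1)
print(lev2)
print(res)
```

So Step 1 only worked by coincidence of the row layout; indexing the full-length column with the clustering order is not a reliable way to build the levels for either variable.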

I think the following will work. unique(Var1) and unique(Var2) are there to make sure there are no duplicates.

vare.dist.long1 <- vare.dist.long %>% mutate(Var1 = factor(Var1, levels = unique(Var1)[vare.hc$order]))

vare.dist.long2 <- vare.dist.long %>% mutate(Var2 = factor(Var2, levels = unique(Var2)[vare.hc$order]))
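
Because both axes need the same ordering, it may be simpler to build the level vector once and apply it to both columns of a single table. The sketch below is one possible way to do that and is not part of the code above. It is an assumption-laden toy: it uses base R and the built-in USArrests data as a stand-in for varespec so that it runs without vegan, and `toy.dist`, `toy.hc`, `toy.long`, and `ord.levels` are illustrative names. It takes the levels from the `labels` and `order` components of the hclust object instead of from the long table.

```r
# Stand-in data: distances between the first six rows of a built-in dataset.
toy.dist <- dist(USArrests[1:6, ])
toy.hc <- hclust(toy.dist)

# Row names in dendrogram order, taken directly from the hclust object.
ord.levels <- toy.hc$labels[toy.hc$order]

# Long format with columns Var1, Var2, value (same shape as the melted matrix).
toy.long <- as.data.frame(as.table(as.matrix(toy.dist)),
                          responseName = "value", stringsAsFactors = FALSE)

# Apply the same level order to both axes of one data frame.
toy.long$Var1 <- factor(toy.long$Var1, levels = ord.levels)
toy.long$Var2 <- factor(toy.long$Var2, levels = ord.levels)

str(toy.long)
```

A table prepared this way can be passed to the same ggplot call as the unclustered version, and the tiles should then follow the dendrogram order on both axes.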