Beautifying and sorting some variables in a Sankey/Alluvial diagram using R
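I am trying to improve my data visualization skills, and I have almost got what I want. At some point, though, I got stuck and could not get any further. Please note that I did extensive research here to try to resolve my doubts, and it helped me a lot.
This is my dataset:
https://app.box.com/s/pp5p5chgypn6ba33anotie7wlxvdu01v
This is my code: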
library(tidyverse)
library(ggalluvial)
library(alluvial)
A_col <- "firebrick3"
B_col <- "darkorange"
C_col <- "aquamarine2"
D_col <- "dodgerblue2"
E_col <- "darkviolet"
F_col <- "chartreuse2"
G_col <- "goldenrod1"
H_col <- "gray73"
set.seed(39)
ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions, color = Positions),
                width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", label.strata = TRUE) +
  scale_x_continuous(breaks = 1:3,
                     labels = c("Activity", "Category", "Positions/Movements"),
                     expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical Activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(), axis.text.y = element_blank(),
        axis.text.x = element_text(size = 12, face = "bold"))
# I also have this code that I run without pre-choosing the colours.
# I like this one because the flow diagram doesn't have any border.
ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  scale_x_discrete(limits = c("Activity", "Category", "Positions/Movements"),
                   expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum() +
  geom_text(stat = "stratum", label.strata = TRUE) +
  theme_minimal() +
  ggtitle("Physical Activity during the week and weekend") +
  theme(legend.position = "none", panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(), axis.text.y = element_blank(),
        axis.text.x = element_text(size = 12, face = "bold"))
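Here is the visualization:
There are three things I really cannot manage to do:
1. Sort Category by week and then weekend, i.e. Working, Non Working, Sleep Week, Leisure and Sleep Weekend.
2. Sort Positions/Movements as Sitting, Lying, Standing, Moving, Stairs, Walk Slow, Walk Fast and Running. I would also like to fill the boxes of this column with the same colours as the flows. In addition, some names do not have enough space; I don't know whether the spacing can be adjusted to fit them, or whether they can be placed outside with an arrow pointing to the box they belong to. I almost forgot: is there a way to assign a colour to each variable manually, for example black for Walk Slow? Also, if possible, I would like to remove the lines along the edges of the flows.
3. Is there a way to stack the names Positions and Movements?
Is there any way to improve this visualization and make it look nicer?
Thanks in advance, Luis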
Here is a solution that addresses some of your questions.
df <- read_csv('Desktop/plot_alluvial_category_position_plus_moviments.csv')
positions <- c("Sitting", "Lying", "Standing", "Moving", "Stairs", "Walk Slow",
               "Walk Fast", "Running")
df$Positions <- factor(df$Positions, levels = positions, labels = positions)
category <- c("Working", "Non Working", "Sleep Week", "Leisure",
              "Sleep Weekend")
df$Category <- factor(df$Category, levels = category, labels = category)
ggplot(df,
       aes(y = Time, axis1 = Activity, axis2 = Category, axis3 = Positions)) +
  geom_alluvium(aes(fill = Positions),
                width = 4/12, alpha = 0.5, knot.pos = 0.3) +
  geom_stratum(width = 4/12, color = "grey36") +
  geom_text(stat = "stratum", label.strata = TRUE, min.height = 100) +
  scale_x_continuous(breaks = 1:3,
                     labels = c("Activity", "Category", "Positions\nMovements"),
                     expand = c(.01, .05)) +
  ylab("Time 24 hours") +
  scale_fill_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  scale_color_manual(values = c(A_col, B_col, C_col, D_col, E_col, F_col, G_col, H_col)) +
  ggtitle("Physical activity during the week and weekend") +
  theme_minimal() +
  theme(legend.position = "none", panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(), axis.text.y = element_blank(),
        axis.text.x = element_text(size = 12, face = "bold"))
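A side note on labelling: as far as I'm aware, `label.strata` has been deprecated in later ggalluvial releases in favour of mapping the label to the computed `stratum` variable, and `min.height` may go by `min.y` there (check `?stat_stratum` for your installed version). A minimal sketch of that idiom follows. It uses the built-in `Titanic` table as a stand-in, since the CSV from the question is not reproduced here, and the threshold of 300 is purely illustrative:

```r
library(ggplot2)
library(ggalluvial)

# Stand-in data: the built-in Titanic table, NOT the dataset from the question.
titanic_df <- as.data.frame(Titanic)

p_labels <- ggplot(titanic_df,
                   aes(y = Freq, axis1 = Class, axis2 = Sex, axis3 = Survived)) +
  geom_alluvium(aes(fill = Survived), width = 4/12, alpha = 0.5) +
  geom_stratum(width = 4/12, color = "grey36") +
  # Label each stratum with its own name; strata shorter than the
  # threshold (in y units) are meant to be left unlabelled.
  geom_text(stat = "stratum", aes(label = after_stat(stratum)),
            min.height = 300) +
  scale_x_discrete(limits = c("Class", "Sex", "Survived"),
                   expand = c(.05, .05)) +
  theme_minimal()

p_labels
```

The threshold is expressed in the units of `y`, so a sensible value depends on the totals in your own data.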
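- To order the strata, you need to convert the Category and Positions columns to factors with the level order set explicitly.
- To remove the borders of the flows, it is enough to drop color = Positions from the aes() call of geom_alluvium.
- You can stack the names Positions and Movements by adding a line break (\n) to the axis label.
- You can assign colours to the strata, but only if the categories are always the same (see some of the examples in the ggalluvial documentation).
- To avoid overlapping labels on small strata, you can use the min.height argument of geom_text, introduced in ggalluvial version 0.9.2, as in the code above.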
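The points about ordering and colours can be sketched on a small example. Everything below is an assumption made for illustration: the `toy` data frame and its values are invented, `fill_cols` is a hypothetical name, and filling the boxes via `after_stat(stratum)` is one possible approach rather than something taken from the answer. A named colour vector ties each colour to a level by name, which is how a single level such as Walk Slow can be set to black:

```r
library(ggplot2)
library(ggalluvial)

# Toy data, invented for illustration only (NOT the dataset from the question).
toy <- data.frame(
  Category  = c("Sleep Week", "Working", "Working", "Leisure"),
  Positions = c("Lying", "Sitting", "Walk Slow", "Sitting"),
  Time      = c(480, 300, 60, 120),
  stringsAsFactors = FALSE
)

# Set the level order explicitly; otherwise levels default to alphabetical order.
toy$Category  <- factor(toy$Category,
                        levels = c("Working", "Sleep Week", "Leisure"))
toy$Positions <- factor(toy$Positions,
                        levels = c("Sitting", "Lying", "Walk Slow"))

# Named vector: colours are matched to strata by name, not by position.
# The Category boxes are given a neutral colour so every stratum is covered.
fill_cols <- c(
  "Sitting" = "firebrick3", "Lying" = "darkorange", "Walk Slow" = "black",
  "Working" = "grey90", "Sleep Week" = "grey90", "Leisure" = "grey90"
)

p_toy <- ggplot(toy, aes(y = Time, axis1 = Category, axis2 = Positions)) +
  # No colour aesthetic here, so the flows are drawn without borders.
  geom_alluvium(aes(fill = Positions), width = 4/12, alpha = 0.5) +
  # Fill each box by its stratum name, with the same transparency as the flows.
  geom_stratum(aes(fill = after_stat(stratum)),
               width = 4/12, alpha = 0.5, color = "grey36") +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_fill_manual(values = fill_cols) +
  scale_x_discrete(limits = c("Category", "Positions"),
                   expand = c(.05, .05)) +
  theme_minimal() +
  theme(legend.position = "none")

p_toy
```

Because the colours are looked up by name, reordering the factor levels changes the stacking order without changing which colour each level receives.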