Histogram of one column by the relative frequency of another column (R)
Suppose we have this table:
x y
1 43
1 54
2 54
3 22
2 22
1 43
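I want the x-axis to show only 1, 2 and 3, i.e. the unique values of x. In addition, the plot should show as a percentage how often each y value occurs within each x value: how often 43 occurs within x = 1, how often 54 occurs, and so on. Should both columns be converted to factors?
Here is my solution: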
library("ggplot2")
library("dplyr")
library("magrittr")  # provides the %<>% assignment pipe

df <- data.frame(x = c(1, 1, 2, 3, 2, 1), y = c(43, 54, 54, 22, 22, 43))

# Counter column: 1 per row, summed below to count
# how often each y value occurs within each x category
df$n <- 1

df %<>%  # assignment pipe: overwrites 'df' with the result
  group_by(x, y) %>%  # forward pipe
  tally(n) %>%  # sums n per (x, y) pair; the result stays grouped by x
  mutate(n = round(n / sum(n), 2))  # share of each y value within its x group

# Plotting; x is converted to a factor so the axis shows only 1, 2 and 3
df %>%
  ggplot(aes(fill = as.factor(y), y = n, x = as.factor(x))) +
  geom_bar(position = "fill", stat = "identity") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "x", y = "Percentage contribution from each y category", fill = "y") +
  # Adding the percentage values as labels, centred in each bar segment
  geom_text(aes(label = paste0(n * 100, "%")), position = position_fill(vjust = 0.5), size = 2)
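Note: every bar is scaled to a total height of 1 because position = "fill" is passed to geom_bar(); scale_y_continuous(labels = scales::percent) then formats the y-axis values as percentages.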
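As a cross-check on the percentages, the same per-x shares can be computed without the helper counter column. This is only a minimal sketch, not part of the solution above: it assumes dplyr's count() for the tallying, and the names shares and check are made up for illustration. The base R prop.table() call gives the same proportions as a contingency table.

```r
library(dplyr)

df <- data.frame(x = c(1, 1, 2, 3, 2, 1), y = c(43, 54, 54, 22, 22, 43))

# count() tallies the rows for each (x, y) pair, so no helper column is needed
shares <- df %>%
  count(x, y) %>%
  group_by(x) %>%
  mutate(share = n / sum(n)) %>%  # relative frequency of y within each x
  ungroup()

# Base R cross-check: row-wise proportions of the x-by-y contingency table
check <- prop.table(table(x = df$x, y = df$y), margin = 1)

print(shares)
print(round(check, 2))
```

For x = 1 both approaches give a share of 2/3 for y = 43 and 1/3 for y = 54, which matches the rounded 67% and 33% labels in the plot.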