Hide/Drop missing values in heat map with ggplot2
I have a data frame with a continuous run of missing values from January 11 to January 14, 2016:
library(lubridate)
library(ggplot2)
set.seed(123)
timestamp1 <- seq(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-10 23:59:59"), by = "hour")
timestamp2 <- seq(as.POSIXct("2016-01-15"), as.POSIXct("2016-01-20 23:59:59"), by = "hour")
data_obj <- data.frame(
  value = c(rnorm(length(timestamp1), 150, 5), rnorm(length(timestamp2), 110, 3)),
  timestamp = c(timestamp1, timestamp2)
)
data_obj$day <- lubridate::date(data_obj$timestamp)
data_obj$hour <- lubridate::hour(data_obj$timestamp)
When I plot a heat map using
ggplot(data_obj,aes(day,hour,fill=value)) + geom_tile()
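I get the heat map below; the rectangular region marked in red corresponds to the missing values.
How can I hide this blank region completely and produce a continuous heat map?
Note that I don't want to change the format of the x-axis dates, and I don't want to show the missing values in a different color.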
If you convert day to a factor, the gap is ignored:
ggplot(data_obj, aes(factor(day),hour,fill=value)) + geom_tile()
Depending on what your real data looks like, you may or may not be happy with how the x-axis looks.
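To see why the factor conversion removes the gap, here is a minimal sketch. It assumes a small stand-in data frame called `demo` (a hypothetical name, not from the question) covering the same days. A factor only has levels for the days that actually occur, so the discrete axis never reserves a slot for January 11 to 14. The `shown` vector is likewise just an illustration of thinning the labels when there are many days.

```r
library(ggplot2)

# Hypothetical stand-in data: one row per hour for Jan 1-10 and Jan 15-20, 2016
days <- c(seq(as.Date("2016-01-01"), as.Date("2016-01-10"), by = "day"),
          seq(as.Date("2016-01-15"), as.Date("2016-01-20"), by = "day"))
demo <- data.frame(day  = rep(days, each = 24),
                   hour = rep(0:23, times = length(days)))
demo$value <- seq_len(nrow(demo))

# The factor levels are the sorted unique days that are present;
# the missing days are not levels, so they get no position on the axis
day_levels <- levels(factor(demo$day))

# Optionally label only every 4th day to keep the axis readable
shown <- day_levels[seq(1, length(day_levels), by = 4)]

p <- ggplot(demo, aes(factor(day), hour, fill = value)) +
  geom_tile() +
  scale_x_discrete(breaks = shown)
```

Printing `p` draws the tiles side by side with no empty band. The labels are in the default `YYYY-MM-DD` form because that is how the dates are converted to factor levels.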
A slight variation on @Jacob's answer that preserves the date label format and order:
library(lubridate)
library(ggplot2)
library(dplyr)  # provides %>%, arrange() and distinct()
set.seed(123)
timestamp1 <- seq(as.POSIXct("2016-01-01"), as.POSIXct("2016-01-10 23:59:59"), by = "hour")
timestamp2 <- seq(as.POSIXct("2016-01-15"), as.POSIXct("2016-01-20 23:59:59"), by = "hour")
data_obj <- data.frame(value = c(rnorm(length(timestamp1), 150, 5),
                                 rnorm(length(timestamp2), 110, 3)),
                       timestamp = c(timestamp1, timestamp2))
data_obj$day <- lubridate::date(data_obj$timestamp)
data_obj$hour <- lubridate::hour(data_obj$timestamp)
# preserve the date order manually in a factor
data_obj$day_f <- format(data_obj$day, "%b %d")
dplyr::arrange(data_obj, day) %>%
  dplyr::distinct(day_f) -> day_f_order
data_obj$day_f <- factor(data_obj$day_f, levels = day_f_order$day_f)
# plot against the ordered factor so the missing days take up no space
ggplot(data_obj, aes(day_f, hour, fill = value)) +
  geom_tile() +
  scale_x_discrete(expand = c(0, 0), breaks = c("Jan 04", "Jan 18")) +
  scale_y_continuous(expand = c(0, 0)) +
  viridis::scale_fill_viridis(name = NULL) +
  coord_equal() +
  labs(x = NULL, y = NULL) +
  theme(panel.background = element_blank()) +
  theme(panel.grid = element_blank()) +
  theme(axis.ticks = element_blank()) +
  theme(legend.position = "bottom")
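Note: without an explicit, very visible annotation explaining that data is missing, you are still misrepresenting the data to your audience.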
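One possible way to act on that caveat is to mark where the gap falls and say so in a caption. The sketch below is only an illustration under stated assumptions: the `demo` data frame and the names `gap_after`, `gap_start`, `gap_end` and `gap_label` are hypothetical, and the dashed line plus caption is just one of many ways to flag the gap. It relies on discrete axis positions being the integers 1, 2, 3, ..., so a line at position `gap_after + 0.5` sits between the last day before the gap and the first day after it.

```r
library(ggplot2)

# Hypothetical stand-in data with the same missing days (Jan 11-14, 2016)
days <- c(seq(as.Date("2016-01-01"), as.Date("2016-01-10"), by = "day"),
          seq(as.Date("2016-01-15"), as.Date("2016-01-20"), by = "day"))
demo <- data.frame(day  = rep(days, each = 24),
                   hour = rep(0:23, times = length(days)))
demo$value <- ifelse(demo$day < as.Date("2016-01-11"), 150, 110)
demo$day_f <- factor(format(demo$day, "%b %d"),
                     levels = format(sort(unique(demo$day)), "%b %d"))

# Locate gaps: positions where consecutive plotted days are more than 1 day apart
u <- sort(unique(demo$day))
gap_after <- which(as.numeric(diff(u), units = "days") > 1)
gap_start <- u[gap_after] + 1      # first missing day of each gap
gap_end   <- u[gap_after + 1] - 1  # last missing day of each gap
gap_label <- paste(sprintf("No data from %s to %s",
                           format(gap_start, "%b %d"),
                           format(gap_end, "%b %d")),
                   collapse = "; ")

p <- ggplot(demo, aes(day_f, hour, fill = value)) +
  geom_tile() +
  # dashed line between the tiles on either side of the gap
  geom_vline(xintercept = gap_after + 0.5, linetype = "dashed") +
  labs(x = NULL, y = NULL, caption = gap_label)
```

Printing `p` shows the continuous heat map with a dashed divider at the gap and a caption naming the missing date range. The month abbreviations depend on the session locale.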