Again, I have 4 graphs in R with different x axes but similar trend profiles. I tried to overlay them, but they are not aligned

Question

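Someone helped me overlay two graphs with different x axes at this link. However, I am now trying to overlay 4 graphs. I tried to overlay them, but they are not aligned. I need help overlaying these four graphs.

My initial trial code is shown below.

My raw data is at the following link: https://drive.google.com/drive/folders/1ZZQAATkbeV-Nvq1YYZMYdneZwMvKVUq1?usp=sharing

Code used for execution: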

library(ggplot2)

# Individual plots, stored under new names so the data frames are not overwritten
p_first <- ggplot(data = first, aes(x, y)) +
  geom_line()

p_second <- ggplot(data = second, aes(x, y)) +
  geom_line()

p_third <- ggplot(data = third, aes(x, y)) +
  geom_line()

p_fourth <- ggplot(data = fourth, aes(x, y)) +
  geom_line()

# Shift second so its maximum lines up with the maximum of first;
# third and fourth are left unshifted here
first$match <- first$x
second$match <- second$x - second$x[second$y == max(second$y)] + first$x[first$y == max(first$y)]
third$match <- third$x
fourth$match <- fourth$x
first$series <- "first"
second$series <- "second"
third$series <- "third"
fourth$series <- "fourth"

all_data <- rbind(first, second, third, fourth)

ggplot(all_data) +
  geom_line(aes(x = match, y, color = series)) +
  scale_x_continuous(name = "X, arbitrary units") +
  theme(axis.text.x = element_blank())

Thank you very much for your help.

Answer 1

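OP, I thought I would offer a solution to your question. OP has 4 datasets containing x and y columns and wants to align the peaks in each dataset so that they stack on top of one another. Here is what it looks like when we plot all the datasets together: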

p <- ggplot(mapping=aes(x=x, y=y)) + theme_bw() +
  geom_line(data=first, aes(color="first")) +
  geom_line(data=second, aes(color="second")) +
  geom_line(data=third, aes(color="third")) +
  geom_line(data=fourth, aes(color="fourth"))

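The approach is as follows:

1. Find the peak x value for each dataset
2. Shift each dataset's x values so that its peak lands on a common reference peak x value (the code below uses the smallest peak x value among the four datasets)
3. Combine the datasets and plot them together, which respects Tidy Data principles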
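
The steps above can be sketched on two tiny synthetic datasets. This is only an illustration: the data frames `a` and `b` and the helper `shift_to_peak()` are hypothetical names, and `which.max()` stands in for a proper peak finder, which is reasonable only when the peak of interest is also the global maximum.

```r
# Two hypothetical datasets with a single peak at different x positions
a <- data.frame(x = 1:7,   y = c(0, 1, 3, 9, 3, 1, 0))   # peak at x = 4
b <- data.frame(x = 11:17, y = c(0, 2, 8, 3, 1, 0, 0))   # peak at x = 13

# Step 1: peak x value of each dataset (which.max is a simplification)
peak_a <- a$x[which.max(a$y)]
peak_b <- b$x[which.max(b$y)]

# Step 2: shift every dataset so its peak lands on the smallest peak x
target <- min(peak_a, peak_b)
shift_to_peak <- function(d, target) {
  d$x <- d$x - d$x[which.max(d$y)] + target
  d
}
a_fix <- shift_to_peak(a, target)
b_fix <- shift_to_peak(b, target)

# Step 3: label each dataset and combine into one long data frame
a_fix$series <- "a"
b_fix$series <- "b"
combined <- rbind(a_fix, b_fix)
```

After the shift, both peaks sit at the same x value, so a single `ggplot()` call with `color = series` can draw them overlaid.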

Finding peaks and adjusting x values

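To find the peaks, I like to use the findpeaks() function from the pracma library. You give the function the y values of your dataset (arranged by increasing x value), and it returns a matrix in which each row represents a "peak". The columns give you the peak height (the y value), the index or row of the dataset where the peak sits, where the peak begins, and where the peak ends. As an example, here is how we can apply this to one of the datasets, along with the result: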

library(pracma)
library(dplyr)  # provides arrange()

first <- arrange(first, x)  # arrange first by increasing x
findpeaks(first$y, sortstr = TRUE, npeaks=1)

        [,1] [,2] [,3] [,4]
[1,] 1047.54  402  286  515

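The sortstr= argument indicates that we want the list of peaks sorted with the "highest" first, and npeaks=1 means we are only interested in the first of them. In this case, we can see that 402 is the index of the x,y value in first where the peak sits. So we can access that x value via first[index,]$x.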
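
To make the return value more concrete, here is a minimal sketch on synthetic data rather than OP's files. The data frame `d` and the two-bump curve are assumptions made up for illustration; only `findpeaks()` and its `sortstr` and `npeaks` arguments come from the text above.

```r
library(pracma)

# Hypothetical data: a small bump near x = 2 and a taller one near x = 6
x <- seq(0, 10, by = 0.1)
y <- 0.5 * exp(-(x - 2)^2) + exp(-(x - 6)^2)
d <- data.frame(x = x, y = y)

# Keep only the tallest peak; columns are height, index, start index, end index
pk <- findpeaks(d$y, sortstr = TRUE, npeaks = 1)

idx <- pk[1, 2]        # row of d where the tallest peak sits
peak_x <- d[idx, ]$x   # x value at that row
```

Here `peak_x` should come out at the position of the taller bump, since sorting by height puts it first.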

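One concern we might have here is that this might not work for fourth, since the maximum y value is not actually the peak of interest. However, if we run the function and test it, the findpeaks() approach works fine and returns the highest peak: apparently the function does not identify a "peak" at the right edge, since it has an "up" but no "down".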
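
The difference from simply taking the maximum can be seen in a small sketch. The curve below is synthetic and only loosely imitates the situation described for fourth: a bump followed by a tail that keeps rising until the end of the data.

```r
library(pracma)

# Hypothetical curve: a bump at x = 3, then a rising tail beyond x = 8
x <- seq(0, 10, by = 0.1)
y <- exp(-(x - 3)^2) + ifelse(x > 8, x - 8, 0)

# The global maximum is the last point of the rising tail
i_max <- which.max(y)

# findpeaks() needs a rise followed by a fall, so the tail is not a peak
pk <- findpeaks(y, sortstr = TRUE, npeaks = 1)
peak_x <- x[pk[1, 2]]
```

In this sketch `i_max` points at the right edge of the data, while `peak_x` is the position of the bump, which is the value we want for alignment.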

The code below handles all the steps we need: arranging, finding peaks, and adjusting peaks.

library(pracma)
library(dplyr)

# sort every dataset by increasing x so the findpeaks() index matches the row
first <- arrange(first, x)
second <- arrange(second, x)
third <- arrange(third, x)
fourth <- arrange(fourth, x)

# find the minimum peak.  We know it's from third, but here's
# how you do it if you don't "know" that

peaks_first <- findpeaks(first$y, sortstr = TRUE, npeaks=1)
peaks_second <- findpeaks(second$y, sortstr = TRUE, npeaks=1)
peaks_third <- findpeaks(third$y, sortstr = TRUE, npeaks=1)
peaks_fourth <- findpeaks(fourth$y, sortstr = TRUE, npeaks=1)

# minimum peak x value
peak_x <- min(c(first[peaks_first[2],]$x, second[peaks_second[2],]$x, third[peaks_third[2],]$x, fourth[peaks_fourth[2],]$x))

# function to use to fix each dataset
fix_x <- function(peak_x, dataset) {
  dataset <- arrange(dataset, x)
  d_peak <- findpeaks(dataset$y, sortstr = TRUE, npeaks=1)
  d_peak_x <- dataset[d_peak[2],]$x
  x_adj <- peak_x - d_peak_x
  dataset$x <- dataset$x + x_adj
  return(dataset)
}

# apply and fix each dataset
fix_first <- fix_x(peak_x, first)
fix_second <- fix_x(peak_x, second)
fix_third <- fix_x(peak_x, third)
fix_fourth <- fix_x(peak_x, fourth)

# combine datasets
fix_first$measure <- 'First'
fix_second$measure <- 'Second'
fix_third$measure <- 'Third'
fix_fourth$measure <- 'Fourth'

fixed <- rbind(fix_first, fix_second, fix_third, fix_fourth)
fixed$measure <- factor(fixed$measure, levels=c('First','Second','Third','Fourth'))

Plotting together

Now that fixed contains all of the data, we can plot everything together:

ggplot(fixed, aes(x=x, y=y, color=measure)) + theme_bw() +
  geom_line()

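Alternative plotting methods

If you want to "stack" the lines on top of one another, that is what is called a ridgeline plot. There are two ways I can show to create a ridgeline plot: faceting, or using ggridges and geom_ridgeline(). I will demonstrate both.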

# Using facets
ggplot(fixed, aes(x=x, y=y, color=measure)) + theme_bw() +
  geom_line(show.legend = FALSE) +
  facet_grid(measure~.)

Note that I chose not to show the legend, since the strip text conveys the same information.

# Using ggridges and geom_ridgeline
library(ggridges)
ggplot(fixed, aes(x=x, y=measure, color=measure)) + theme_bw() +
  geom_ridgeline(aes(height=y), fill=NA, scale=0.001)

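When using geom_ridgeline(), you will notice that the y= aesthetic becomes the column used for stacking, and your original y values are mapped to the height= aesthetic instead. I also had to play around with scale=, since for discrete values each measure is treated as an integer (1, 2, 3, 4). Your height= values are much higher than that, so we have to scale them down to fit within that range (by a factor of about 1000).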
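
Rather than finding scale= by trial and error, one possible starting point is to derive it from the data. The sketch below assumes a stand-in data frame `demo` with the same columns as fixed; `demo` and `ridge_scale` are hypothetical names, and dividing by the maximum height is just one reasonable choice, not the only one.

```r
library(ggplot2)
library(ggridges)

# Hypothetical stand-in for the combined data: two aligned bumps
x <- seq(-5, 5, by = 0.1)
demo <- rbind(
  data.frame(x = x, y = 1000 * exp(-x^2), measure = "First"),
  data.frame(x = x, y = 600 * exp(-x^2), measure = "Second")
)
demo$measure <- factor(demo$measure, levels = c("First", "Second"))

# With this scale the tallest curve rises one unit above its baseline,
# i.e. up to the baseline of the next series
ridge_scale <- 1 / max(demo$y)

p <- ggplot(demo, aes(x = x, y = measure, color = measure)) + theme_bw() +
  geom_ridgeline(aes(height = y), fill = NA, scale = ridge_scale)
```

For heights around 1000 this gives a scale near 0.001, in line with the value found by hand above. Multiplying `ridge_scale` by a factor below 1 leaves a gap between rows, and a factor above 1 makes them overlap.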
