Make ggplots stored in a plot list respect variable values at the time of plot generation within a for loop

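I have an elaborate plotting routine that generates box plots with an additional scatter layer and adds them to a plot list.

The routine produces correct plots if they are printed directly via print(current_plot_complete) while they are being created in the for loop.

However, if they are added to a plot list during the for loop and only printed at the end, the plots are incorrect: the final index is used to generate all plots (instead of the index that was current when each plot was generated). This seems to be default ggplot2 behavior, and I am looking for a way to circumvent it in the current use case.

The problem seems to lie in y = eval(parse(text=(paste0(COL_i)))), where the global environment is used (and therefore the final index value) instead of the value that was current during loop execution.

I tried various approaches to make eval() use the correct variable value, for example local(…) or specifying the environment, but without success.

A very simplified MWE is provided below.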

MWE

The original routine is far more elaborate than this MWE, so the for loop cannot easily be replaced with a member of the apply family.

library(dplyr)    # %>%, filter(), select()
library(tidyr)    # gather()
library(ggplot2)

# create some random data
data_temp <- data.frame(
"a" = sample(x = 1:100, size  = 50),
"b" = rnorm(n = 50, mean = 45, sd = 1),
"c" = sample(x = 20:70, size  = 50), 
"d" = rnorm(n = 50, mean = 40, sd = 15),
"e" = rnorm(n = 50, mean = 50, sd = 10),
"f" = rnorm(n = 50, mean = 45, sd = 1),
"g" = sample(x = 20:70, size  = 50)
)
COLs_current <- c("a", "b", "c", "d", "e") # define COLs of data to include in box plots
choice_COLs <- c("a", "d")      # define COLs of data to add scatter to

plot_list <- list(NA)
plot_index <- 1

for (COL_i in choice_COLs) {

  COL_i_index <- which(COL_i == COLs_current)

  # Generate "basis boxplot" (to plot scatterplot on top)
  boxplot_scores <- data_temp %>% 
    gather(COL, score, all_of(COLs_current)) %>%
    ggplot(aes(x = COL, y = score)) +
    geom_boxplot() 

  # Get relevant data of COL_i for scattering: data of 4th quartile
  quartile_values <- quantile(data_temp[[COL_i]])
  threshold <- quartile_values["75%"]           # threshold = 3. quartile value
  data_temp_filtered <- data_temp %>%
    filter(data_temp[[COL_i]] > threshold) %>%  # filter the data of the 4th quartile
    dplyr::select(all_of(COLs_current))

  # Create layer of scatter for 4th quartile of COL_i
  scatter_COL_i <- geom_point(data=data_temp_filtered, mapping = aes(x = COL_i_index, y = eval(parse(text=(paste0(COL_i))))), color= "orange")

  # add geom objects to create final plot for COL_i
  current_plot_complete <- boxplot_scores + scatter_COL_i 

  print(current_plot_complete)

  plot_list[[plot_index]] <- current_plot_complete 
  plot_index <- plot_index + 1
}

plot_list

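I think the problem is that ggplot uses lazy evaluation. By the time the list is rendered, the loop index holds its final value, and that value is used to render every plot in the list.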
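To make the deferred lookup visible, here is a minimal sketch on a toy data frame. The names toy, col_name, make_plot, loop_plots and fun_plots are made up for illustration and are not part of the original routine, and the .data pronoun is used here in place of eval(parse(...)). It builds the same plot once in a for loop and once inside a function, then inspects the data that would actually be drawn.

```r
library(ggplot2)

# Toy data (hypothetical, not the asker's data_temp)
toy <- data.frame(a = c(1, 2, 3), d = c(10, 20, 30))

# Loop version: aes() stores the expression `.data[[col_name]]` unevaluated,
# so `col_name` is only looked up when the plot is built.
loop_plots <- list()
for (col_name in c("a", "d")) {
  loop_plots[[col_name]] <- ggplot(toy, aes(x = 1, y = .data[[col_name]])) +
    geom_point()
}

# Function version: every call has its own environment with its own `col_name`.
make_plot <- function(col_name) {
  force(col_name)  # evaluate the argument now rather than at build time
  ggplot(toy, aes(x = 1, y = .data[[col_name]])) + geom_point()
}
fun_plots <- lapply(c(a = "a", d = "d"), make_plot)

# layer_data() builds a plot and returns the data frame that would be drawn.
loop_y_a <- layer_data(loop_plots[["a"]])$y  # built after the loop has finished
fun_y_a  <- layer_data(fun_plots[["a"]])$y
```

Under these assumptions, loop_y_a should hold the values of column d, because col_name is "d" once the loop has finished, while fun_y_a should hold the values of column a.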

This post is related.

I propose this solution, although it does not explain why your version does not work the way you expected:

# Define the function first, then apply it to every column of interest
temporary_function <- function(COL_i){
    COL_i_index <- which(COL_i == COLs_current)

    # Generate "basis boxplot" (to plot scatterplot on top)
    boxplot_scores <- data_temp %>% 
        gather(COL, score, all_of(COLs_current)) %>%
        ggplot(aes(x = COL, y = score)) +
        geom_boxplot() 

    # Get relevant data of COL_i for scattering: data of 4th quartile
    quartile_values <- quantile(data_temp[[COL_i]])
    threshold <- quartile_values["75%"]           # threshold = 3rd quartile value
    data_temp_filtered <- data_temp %>%
        filter(data_temp[[COL_i]] > threshold) %>%  # filter the data of the 4th quartile
        dplyr::select(all_of(COLs_current))

    # Create layer of scatter for 4th quartile of COL_i
    scatter <- geom_point(data = data_temp_filtered,
                          mapping = aes(x = COL_i_index,
                                        y = eval(parse(text = (paste0(COL_i))))),
                          color = "orange")

    # add geom objects to create final plot for COL_i
    current_plot_complete <- boxplot_scores + scatter

    return(current_plot_complete)
}

# Each call runs in its own environment, so every plot keeps its own COL_i
l <- lapply(choice_COLs, temporary_function)
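If the for loop has to stay, one possible alternative is to inject the current values into aes() with the !! operator, which aes() supports through rlang quasiquotation. The following is only a sketch on a toy data frame; toy, cols, col_name and col_index are made-up names, and the .data pronoun stands in for eval(parse(...)).

```r
library(ggplot2)

# Toy data (hypothetical, not the asker's data_temp)
toy <- data.frame(a = c(1, 2, 3), d = c(10, 20, 30))
cols <- c("a", "d")

plot_list <- list()
for (col_name in cols) {
  col_index <- which(col_name == cols)

  # `!!` replaces col_index and col_name with their current values while
  # aes() is being created, so nothing is left to look up at render time.
  plot_list[[col_name]] <- ggplot(toy) +
    geom_point(aes(x = !!col_index, y = .data[[!!col_name]]), color = "orange")
}

# Inspect the data each stored plot would draw, after the loop has finished.
y_a <- layer_data(plot_list[["a"]])$y
y_d <- layer_data(plot_list[["d"]])$y
```

In the MWE this would correspond to writing aes(x = !!COL_i_index, y = .data[[!!COL_i]]) in the geom_point() layer, assuming the rest of the loop stays unchanged.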

You will not run into this problem when you use lapply. It was inspired by