R: ggplot2 multiple regression lines grouped by variable

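I have a data frame with 3 columns (sample below). My goal is to plot the variable "Return" on the y-axis and "BetaRealized" on the x-axis. On that plot I would like two regression lines grouped by "SML": one line fitted to the 2 "Theoretical" values and one fitted to the 10 "Empirical" values. Preferably I would like to use ggplot2.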

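I have looked through several other questions but could not find one that matches my problem. I am new to R, so any help is appreciated. If necessary, feel free to help me improve the question for future users.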

Reproducible data sample:

structure(list(SML = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L), .Label = c("Empirical", "Theoretical"), class = "factor"), 
    Return = c(0.00136162543341773, 0.00327371856919072, 0.00402550498386094, 
    0.00514512870557883, 0.00491788632261087, 0.00501053666090353, 
    0.00485590289408263, 0.00576880451680399, 0.00579134238930521, 
    0.00704131096883141, 0.00471917614445859, 0), BetaRealized = c(0.42574984058487, 
    0.576898009418581, 0.684024167075167, 0.763551381826944, 
    0.833875797322081, 0.902738972263857, 0.976227211834564, 
    1.06544414896672, 1.19436401770255, 1.50932083346054, 0.893219438045588, 
    0)), class = "data.frame", row.names = c(NA, -12L))

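Following AntoniosK's comment, the solution appears to be to use geom_smooth together with the color aesthetic, as shown below. First, store the sample data in a data frame named df: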

df<-data.frame(structure(list(SML = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L), .Label = c("Empirical", "Theoretical"), class = "factor"), 
Return = c(0.00136162543341773, 0.00327371856919072, 0.00402550498386094, 
0.00514512870557883, 0.00491788632261087, 0.00501053666090353, 
0.00485590289408263, 0.00576880451680399, 0.00579134238930521, 
0.00704131096883141, 0.00471917614445859, 0), BetaRealized = c(0.42574984058487, 
0.576898009418581, 0.684024167075167, 0.763551381826944, 
0.833875797322081, 0.902738972263857, 0.976227211834564, 
1.06544414896672, 1.19436401770255, 1.50932083346054, 0.893219438045588, 
0)), class = "data.frame", row.names = c(NA, -12L)))

Then call ggplot like this:

library(ggplot2)

# Mapping color to SML makes geom_smooth fit one line per group
ggplot(df, aes(BetaRealized, Return, color = SML)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

The output will look like this: graph
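
To see the numbers behind the two lines, one option is to fit the same per-group models directly with lm. This is only a minimal sketch: it assumes geom_smooth is using its default formula y ~ x, it rebuilds the sample data in a more compact form, and the name group_coefs is made up for illustration.

```r
# Rebuild the sample data from the question (same values, written compactly).
df <- data.frame(
  SML = factor(c(rep("Empirical", 10), rep("Theoretical", 2))),
  Return = c(0.00136162543341773, 0.00327371856919072, 0.00402550498386094,
             0.00514512870557883, 0.00491788632261087, 0.00501053666090353,
             0.00485590289408263, 0.00576880451680399, 0.00579134238930521,
             0.00704131096883141, 0.00471917614445859, 0),
  BetaRealized = c(0.42574984058487, 0.576898009418581, 0.684024167075167,
                   0.763551381826944, 0.833875797322081, 0.902738972263857,
                   0.976227211834564, 1.06544414896672, 1.19436401770255,
                   1.50932083346054, 0.893219438045588, 0)
)

# Fit one simple linear regression per level of SML.
# group_coefs is a hypothetical name, not part of the original answer.
group_coefs <- lapply(
  split(df, df$SML),
  function(d) coef(lm(Return ~ BetaRealized, data = d))
)

# Named list with one (Intercept, BetaRealized) coefficient vector per group.
print(group_coefs)
```

Because the "Theoretical" group has only two points, one of them at the origin, its fitted line passes through both exactly: the intercept is zero up to floating-point error and the slope is simply Return / BetaRealized of the other point.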

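In addition, you can use the ggpubr package to add the regression equations:

library(ggpubr)  # provides stat_regline_equation()

ggplot(df, aes(BetaRealized, Return, color = SML)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +
  stat_regline_equation()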

Finally, depending on your objective, it may be useful to separate the categories with facet_wrap:

ggplot(df, aes(BetaRealized, Return, color = SML)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE) +
  facet_wrap(~SML) +
  stat_regline_equation()

The plot will look like this: graph2