How to assign titles and other text to the correct plots in a FacetGrid

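I am trying to visualize correlations with Seaborn and its .map function. However, the plots always come back with the wrong titles, and the other text I add ends up on a different plot than the one it describes.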

For example (the regression line should be steeper the larger the R value):

I generated this plot with the following code:

import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

values = [[True, 2, 5], [True, 3, 7], [True, 4, 9], [True, 5, 11], [True, 6, 13], [True, 7, 15],
          [False, 2, 3], [False, 3, 3], [False, 4, 15], [False, 5, 4], [False, 6, 1], [False, 7, 5]]

data = pd.DataFrame(values, columns = ["open",'col_A', 'col_B'])

group_by = "open"


grouped = data["col_A"].groupby(data[group_by])
correlation = grouped.corr(data["col_B"], method="pearson")
data = data[data["col_A"].notna()]
data = data[data["col_B"].notna()]
data["col_A"] = pd.to_numeric(data["col_A"])
data["col_B"] = pd.to_numeric(data["col_B"])
data["col_A"] = np.log(1 + data["col_A"])
data["col_B"] = np.log(1 + data["col_B"])

g = sns.FacetGrid(data, col=group_by, col_wrap=4)
g.map(sns.regplot, "col_A", "col_B")
col_order = data[group_by].unique() 
print(type(col_order), col_order)
for txt, title in zip(g.axes.flat, col_order):
    txt.set_title(title)   
    # add text
    txt.text(1.2, 1.2, "R = " + str(correlation[title]), fontsize = 12)
                        
plt.show()

When I use this approach instead, it works fine:

sns.lmplot(data=data, x="col_A", y="col_B", col=group_by, col_wrap=2)

Presumably the line col_order = data[group_by].unique() returns a different order than the one FacetGrid uses. How can I make sure both orders are correct and identical?

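The problem is that you compute the correlations from a groupby object, but later ignore that object and define a different col_order. The groupby keys are sorted (False, then True), and that is also the order in which the facets are drawn here, whereas data[group_by].unique() returns the values in order of first appearance (True, then False). The solution is to take the order from the groups of the groupby object: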

import seaborn as sns
import numpy as np
from matplotlib import pyplot as plt
import pandas as pd

values = [[True, 2, 5], [True, 3, 7], [True, 4, 9], [True, 5, 11], [True, 6, 13], [True, 7, 15],
[False, 2, 3], [False, 3, 3], [False, 4, 15], [False, 5, 4], [False, 6, 1], [False, 7, 5]]

data = pd.DataFrame(values, columns = ["open",'col_A', 'col_B'])

group_by = "open"

grouped = data["col_A"].groupby(data[group_by])
correlation = grouped.corr(data["col_B"], method="pearson")
data = data[data["col_A"].notna()]
data = data[data["col_B"].notna()]
data["col_A"] = pd.to_numeric(data["col_A"])
data["col_B"] = pd.to_numeric(data["col_B"])
data["col_A"] = np.log(1 + data["col_A"])
data["col_B"] = np.log(1 + data["col_B"])

g = sns.FacetGrid(data, col=group_by, col_wrap=2)
g.map(sns.regplot, "col_A", "col_B")
col_order = grouped.groups.keys()

for txt, title in zip(g.axes.flat, col_order):
    txt.set_title(title)   
    txt.text(1.2, 1.2, f'R = {correlation[title]:.2}', fontsize = 12)
                        
plt.show()

Sample output of the version using grouped.groups.keys():

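Two unrelated remarks:

  1. R is not the slope but the Pearson correlation coefficient. It measures how closely the points follow a straight line, not how steep that line is.
  2. I changed the string concatenation to an f-string. The advantage is that you can control the number formatting, here limiting R to two significant digits with :.2.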
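Both remarks can be checked with a small sketch. It reuses the col_A and col_B values of the False group from the question; the helper slope_and_r is a name made up for this illustration. Multiplying y by 10 makes the fitted line ten times steeper while R stays the same.

```python
import numpy as np

# col_A / col_B values of the False group from the question
x = np.array([2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([3, 3, 15, 4, 1, 5], dtype=float)

def slope_and_r(x, y):
    """Return the least-squares slope and the Pearson correlation of y on x."""
    slope = np.polyfit(x, y, 1)[0]   # coefficient of the degree-1 term
    r = np.corrcoef(x, y)[0, 1]      # off-diagonal entry of the 2x2 matrix
    return slope, r

slope_1, r_1 = slope_and_r(x, y)
slope_10, r_10 = slope_and_r(x, 10 * y)  # same points, y stretched tenfold

# The :.2 format spec limits each number to two significant digits.
print(f"original:  slope = {slope_1:.2}, R = {r_1:.2}")
print(f"stretched: slope = {slope_10:.2}, R = {r_10:.2}")
```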