How to report the correlation of one column in relation to other columns in pandas

I have data like this:

df1  df2  df3  df4  df5
1    3    3    4    5
4    4    3    4    3
5    5    1    -2   1
9    7    3    0    -2

I want to report whether column df1 is strongly correlated with each of the other columns (df2, df3, df4 and df5).

The output should look like this:

df1 is strongly correlated to df2
df1 is not strongly correlated to df3
df1 is not strongly correlated to df4
df1 is strongly correlated to df5

You can use df.corr() to compute the pairwise correlation of the columns.
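As a minimal sketch of that suggestion, the snippet below rebuilds the sample data from the question, computes the full Pearson correlation matrix with df.corr(), and keeps only the df1 column of it. The variable names df and corr_with_df1 are assumptions chosen for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'df1': [1, 4, 5, 9],
    'df2': [3, 4, 5, 7],
    'df3': [3, 3, 1, 3],
    'df4': [4, 4, -2, 0],
    'df5': [5, 3, 1, -2],
})

# Full pairwise (Pearson) correlation matrix; select the df1 column and
# drop df1's correlation with itself (always 1.0)
corr_with_df1 = df.corr()['df1'].drop('df1')
print(corr_with_df1.round(2))
```

Negative values indicate an inverse relationship, so a "strong" correlation is usually judged on the absolute value.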

One idea is to use DataFrame.corrwith, with a strong correlation defined here as an absolute value greater than some threshold, for example 0.7:

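# df.pop('df1') removes df1 from df in place and returns it as a Series;
# corrwith then correlates every remaining column with that Series
m = df.corrwith(df.pop('df1')).abs().gt(0.7)
print(m)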
df2     True
df3    False
df4    False
df5     True
dtype: bool

for k, v in m.items():
    if v:
        print(f'df1 is strongly correlated to {k}')
    else:
        print(f'df1 is not strongly correlated to {k}')

df1 is strongly correlated to df2
df1 is not strongly correlated to df3
df1 is not strongly correlated to df4
df1 is strongly correlated to df5
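
Because df.pop('df1') removes the column from df in place, the steps above could also be wrapped in a small helper that leaves the original DataFrame untouched. This is only a sketch: the function name report_strong_correlations, its parameters, and the default threshold of 0.7 are assumptions for illustration, not part of pandas.

```python
import pandas as pd


def report_strong_correlations(df, target, threshold=0.7):
    """Hypothetical helper: one message per non-target column."""
    # drop() returns a new DataFrame, so the caller's data is not modified
    others = df.drop(columns=target)
    # True where the absolute correlation with the target exceeds the threshold
    strong = others.corrwith(df[target]).abs().gt(threshold)
    return [
        f"{target} is {'' if is_strong else 'not '}strongly correlated to {col}"
        for col, is_strong in strong.items()
    ]


# Sample data from the question
df = pd.DataFrame({
    'df1': [1, 4, 5, 9],
    'df2': [3, 4, 5, 7],
    'df3': [3, 3, 1, 3],
    'df4': [4, 4, -2, 0],
    'df5': [5, 3, 1, -2],
})

for line in report_strong_correlations(df, 'df1'):
    print(line)
```

The threshold is a judgment call; 0.7 simply follows the example value used in the answer.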