Convert vertical matrix to correlation matrix. Python

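I used the pd.DataFrame.corr() method to create a correlation matrix from my DataFrame, then did some processing that cut out certain values, which left me with a table like DF_interactions below. I now want to convert it back into correlation-matrix form, like DF_corr below.

What is the most efficient way to convert the interactions table into a correlation-style matrix using pandas, numpy, sklearn, or scipy?

I've included my naive approach to filling this DataFrame...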

import numpy as np
import pandas as pd

#Create table of interactions
DF_interactions=pd.DataFrame([["A","B",0.1],
                              ["A","C",0.4],
                              ["B","C",0.3],
                              ["A","D",0.4]],columns=["var1","var2","corr"])
#   var1 var2  corr
# 0    A    B   0.1
# 1    A    C   0.4
# 2    B    C   0.3
# 3    A    D   0.4
n, m = DF_interactions.shape
# n = 4 rows, m = 3 columns
#Show which labels would be in correlation matrix for rows/columns
nodes = set(DF_interactions["var1"]) | set(DF_interactions["var2"])
# {'A', 'B', 'C', 'D'}  (set ordering is arbitrary)

#Create empty DataFrame to fill
DF_corr = pd.DataFrame(np.zeros((len(nodes),len(nodes))), columns = sorted(nodes),index=sorted(nodes))
#    A  B  C  D
# A  0  0  0  0
# B  0  0  0  0
# C  0  0  0  0
# D  0  0  0  0

#Naive way to fill it
for i in range(n):
    var1 = DF_interactions.iloc[i,0]
    var2 = DF_interactions.iloc[i,1]
    corr = DF_interactions.iloc[i,2]
    DF_corr.loc[var1,var2] = corr
    DF_corr.loc[var2,var1] = corr
#      A    B    C    D
# A  0.0  0.1  0.4  0.4
# B  0.1  0.0  0.3  0.0
# C  0.4  0.3  0.0  0.0
# D  0.4  0.0  0.0  0.0

Assuming your interactions table contains only one half of the correlations, i.e. each pair appears once (if unsure, add .drop_duplicates()):

corr = pd.concat([DF_interactions, DF_interactions.rename(columns={'var1': 'var2', 'var2': 'var1'})])

Then use .pivot():

corr = corr.pivot(index='var1', columns='var2', values='corr')

var2    A    B    C    D
var1                    
A     NaN  0.1  0.4  0.4
B     0.1  NaN  0.3  NaN
C     0.4  0.3  NaN  NaN
D     0.4  NaN  NaN  NaN

If you prefer 0 for missing interactions, use .fillna(0).
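
Putting the steps together, here is a minimal end-to-end sketch. The helper name interactions_to_matrix and its fill_value parameter are made up for illustration and are not part of pandas. The extra .reindex() call is an assumption on my part: it keeps the result square and sorted even if the pivot were to come out with different row and column labels.

```python
import numpy as np
import pandas as pd

DF_interactions = pd.DataFrame(
    [["A", "B", 0.1],
     ["A", "C", 0.4],
     ["B", "C", 0.3],
     ["A", "D", 0.4]],
    columns=["var1", "var2", "corr"],
)


def interactions_to_matrix(df, fill_value=0.0):
    """Hypothetical helper: long (var1, var2, corr) table -> symmetric matrix."""
    # Mirror every pair so that (A, B) also appears as (B, A).
    swapped = df.rename(columns={"var1": "var2", "var2": "var1"})
    both = pd.concat([df, swapped], ignore_index=True)
    # Guard against pairs that were already listed in both directions;
    # pivot() raises on duplicate (index, column) entries.
    both = both.drop_duplicates(subset=["var1", "var2"])
    matrix = both.pivot(index="var1", columns="var2", values="corr")
    # Force identical, sorted labels on both axes.
    labels = sorted(set(df["var1"]) | set(df["var2"]))
    matrix = matrix.reindex(index=labels, columns=labels)
    # Missing interactions (including the diagonal) become fill_value.
    return matrix.fillna(fill_value)


DF_corr = interactions_to_matrix(DF_interactions)
print(DF_corr)
```

Note that the diagonal is also filled with 0 here, which matches the naive loop in the question. If you want a true correlation matrix with 1.0 on the diagonal, you would have to set it yourself afterwards.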