Convert a multidimensional array to a data frame in Python

I have the following dataset:

import pandas as pd
from sklearn.metrics.pairwise import euclidean_distances

data = {'StoreID': ['a', 'b', 'c', 'd'],
        'Sales': [1000, 200, 500, 800],
        'Profit': [600, 100, 300, 500]}
data = pd.DataFrame(data)
data.set_index(['StoreID'], inplace=True, drop=True)
X = data.values
# Pairwise Euclidean distances between the (Sales, Profit) rows
dist = euclidean_distances(X)

This gives me the following array (values shown truncated):

array([[0. ,943,583,223],
       [943, 0.,360,721],
       [583,360,0., 360],
       [223,721,360, 0.]])
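
Each entry is the straight-line distance between two stores' (Sales, Profit) rows. As a quick sanity check, the a-b entry can be reproduced by hand with NumPy; the variable names below are illustrative only and not part of the original code.

```python
import numpy as np

# (Sales, Profit) for stores a and b, copied from the dataset above
a = np.array([1000, 600])
b = np.array([200, 100])

# Euclidean distance: square root of the summed squared differences
d_ab = np.sqrt(((a - b) ** 2).sum())
print(round(float(d_ab), 6))  # 943.398113
```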

My goal is to get each unique pair of stores together with its distance. I would like the final result to be a data frame like this:

Store   NextStore   Dist
a       b           943
a       c           583
a       d           223
b       c           360
b       d           721
c       d           360

Thanks for your help!

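You probably want pandas.melt, which "unpivots" the distance matrix from wide format into a tall-and-skinny format. First, label the columns with the store IDs and add the row labels as a regular column: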

m = pd.DataFrame(dist)
m.columns = list('abcd')
m['Store'] = list('abcd')

...which produces:

            a           b           c           d Store
0    0.000000  943.398113  583.095189  223.606798     a
1  943.398113    0.000000  360.555128  721.110255     b
2  583.095189  360.555128    0.000000  360.555128     c
3  223.606798  721.110255  360.555128    0.000000     d
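
The labels above are hard-coded as list('abcd'). A minimal sketch of the same step that reads the labels from the StoreID index instead, so it keeps working if stores are added; it assumes the index order matches the row order of the distance matrix, and it computes the distances with NumPy broadcasting only to stay self-contained (euclidean_distances(X) should give the same values).

```python
import numpy as np
import pandas as pd

# Rebuild the question's data so this sketch runs on its own
data = pd.DataFrame({'StoreID': ['a', 'b', 'c', 'd'],
                     'Sales': [1000, 200, 500, 800],
                     'Profit': [600, 100, 300, 500]}).set_index('StoreID')
X = data.to_numpy(dtype=float)

# Pairwise Euclidean distances via broadcasting
diff = X[:, None, :] - X[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Take the labels from the index rather than hard-coding them
labels = data.index.tolist()
m = pd.DataFrame(dist, columns=labels)
m['Store'] = labels
print(m)
```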

Melt the data into tall-and-skinny format:

pd.melt(m, id_vars=['Store'], var_name='nextStore')

   Store nextStore       value
0      a         a    0.000000
1      b         a  943.398113
2      c         a  583.095189
3      d         a  223.606798
4      a         b  943.398113
5      b         b    0.000000
6      c         b  360.555128
7      d         b  721.110255
8      a         c  583.095189
9      b         c  360.555128
10     c         c    0.000000
11     d         c  360.555128
12     a         d  223.606798
13     b         d  721.110255
14     c         d  360.555128
15     d         d    0.000000

Drop the redundant rows (self-pairs and mirrored duplicates), convert Dist to int, and sort:

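# Melt again, this time naming the output columns explicitly
df2 = pd.melt(m, id_vars=['Store'],
              var_name='NextStore',
              value_name='Dist')
# Keep each unordered pair once; this also drops the zero-distance self-pairs
df3 = df2[df2.Store < df2.NextStore].copy()
# astype('int') truncates the fractional part
df3.Dist = df3.Dist.astype('int')
df3.sort_values(by=['Store', 'NextStore'])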

   Store NextStore  Dist
4      a         b   943
8      a         c   583
12     a         d   223
9      b         c   360
13     b         d   721
14     c         d   360
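
As an alternative to melting and filtering, the pairs can be read straight from the upper triangle of the distance matrix. This is only a sketch of a different technique (np.triu_indices), not the approach used above; the names stores and pairs are illustrative, and the distances are recomputed with NumPy so the block runs on its own.

```python
import numpy as np
import pandas as pd

stores = np.array(['a', 'b', 'c', 'd'])
X = np.array([[1000, 600], [200, 100], [500, 300], [800, 500]], dtype=float)

# Pairwise Euclidean distances via broadcasting
diff = X[:, None, :] - X[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Positions strictly above the diagonal (k=1 skips the zero self-distances)
i, j = np.triu_indices(len(stores), k=1)

pairs = pd.DataFrame({
    'Store': stores[i],
    'NextStore': stores[j],
    'Dist': dist[i, j].astype(int),  # truncates, like astype('int') above
})
print(pairs)
```

This builds only the n(n-1)/2 rows that are needed instead of all n² rows, and the result is already ordered by Store and then NextStore.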