Compute Correlation Dataframe for each Vector Row by Index Python

Question

I have a dataframe with 500 columns, indexed by date, covering four years of data.

| Date | A | AL | APEC | NASA | ABC ...
| 2004-01-02 | 18.442521 | 25.954398 | 1.38449 | 11.528444 ...
| 2004-01-05 | 18.922795 | 25.718507 | 1.442394 | 11.919131 ...
| 2004-01-06 | 19.518334 | 26.177538 | 1.437189 | 11.870028 ...
... and so on ...

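I want to compute the Pearson correlation matrix for each day, that is, for each row. I want to save the matrices by date in the most space-efficient way that R can read. (Right now I am aiming for separate worksheets in Excel, one per index date. I am open to suggestions.)

I have tried a few approaches, but this one seemed the most promising, since I could not apply corr() directly to df.groupby.

However, this approach returns an empty dataframe, and now I am stuck! I am looking for an approach that does not involve iteration.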

def do_Corr(df_group):
    """Apply the function to each group in the data and return one result."""
    X = df_group.corr()
    return X

df.groupby([df.index.year, df.index.month, df.index.day]).apply(do_Corr).dropna()

Answer 1

You probably want df.T.corr(). .T transposes the dataframe so that the rows become columns, and you can then apply the .corr() method.
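As a minimal sketch of that suggestion, the example below builds a small stand-in for the dataframe in the question (the column names A to D and the three-row size are assumptions for illustration only). It also shows why the groupby attempt likely came back empty: a group containing a single date has only one observation per column, so every pairwise correlation is NaN and dropna() removes all of it. Note that df.T.corr() produces a single matrix that correlates the dates with each other across the columns, which may or may not be what was meant by "a matrix for each day".

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: 3 dates, 4 columns.
dates = pd.to_datetime(["2004-01-02", "2004-01-05", "2004-01-06"])
df = pd.DataFrame(
    [
        [18.442521, 25.954398, 1.384490, 11.528444],
        [18.922795, 25.718507, 1.442394, 11.919131],
        [19.518334, 26.177538, 1.437189, 11.870028],
    ],
    index=dates,
    columns=["A", "B", "C", "D"],  # placeholder column names
)

# One date means one observation per column, so each pairwise
# correlation is undefined (NaN); dropna() would then drop every row.
single_day = df.iloc[[0]].corr()

# Transposing turns each date into a column, so corr() now compares
# dates with each other, using the original columns as observations.
date_corr = df.T.corr()

print(single_day)
print(date_corr)
```

The result date_corr is indexed by date on both axes, so a row such as date_corr.loc["2004-01-02"] gives the correlation of that day's vector with every other day's vector.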


dataframe

python-3.x

pandas

pearson-correlation