Calculate the slope based on two columns "coordinates"

Question

I have a pandas DataFrame that looks similar to this (the date is the index):

>>>         J01B_X  J01B_y  J02C_x  J02C_y  ...
date
2019-06-23    0.45    1.12    4.56     1.1
2019-06-24    0.22    1.18    5.5      0.8
2019-06-25    0.35    1.10    6.1      8.3
...

The original table has 58 columns like this; basically every observation has two values, an x value and a y value.

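I want to calculate the slope from the X and Y values in the columns:

(0.45 1.12, 0.22 1.18, 0.35 1.10) -> calculated slope for observation J01B, based on J01B_X and J01B_y
(4.56 1.1, 5.5 0.8, 6.1 8.3) -> calculated slope for observation J02C, based on J02C_x and J02C_y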

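The problem is that I have 58 columns like this, and each slope has to be calculated from a pair of columns.

In the end I want a single row instead of the original table, holding the slope calculated from each pair of columns, like this (these are fake numbers):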

>>>            J01B   J02C    ....   
               0.13    0.05

Is there any way to do this?

Answer 1

This example creates a pandas Series, which is basically a one-dimensional pandas object, like a single row. If you want, you can create a DataFrame from it.

import pandas as pd
from scipy import stats

# df is the DataFrame from the question.
# linregress returns a result object; its slope field is the fitted slope.
slopeB = stats.linregress(df['J01B_X'], df['J01B_y']).slope
slopeC = stats.linregress(df['J02C_x'], df['J02C_y']).slope

# Create a pandas Series with the slope data
slopes = pd.Series([slopeB, slopeC], index=['J01B', 'J02C'], name="Slope")
# Transpose it into a single-row DataFrame
slopedf = pd.DataFrame(slopes).T

slopes looks like this:

J01B   -0.278195
J02C    4.233791
Name: Slope, dtype: float64

slopedf looks like this, a DataFrame with only one row:

           J01B      J02C
Slope -0.278195  4.233791
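
As a sanity check on these numbers: the slope that linregress reports is the ordinary least-squares slope, that is, the sum of (x - mean_x) * (y - mean_y) divided by the sum of (x - mean_x)^2. Below is a minimal sketch that recomputes the J01B value by hand from the three sample rows in the question. The helper name ols_slope is made up for illustration and is not part of pandas or SciPy.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x (hypothetical helper, for illustration only)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# The three J01B rows from the question: x = J01B_X, y = J01B_y
j01b_slope = ols_slope([0.45, 0.22, 0.35], [1.12, 1.18, 1.10])
print(round(j01b_slope, 6))
```

Rounded to six decimals this should agree with the J01B entry above.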

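Both slopes and slopedf can be queried in the same way, but the Series returns the numeric value of the entry, while slopedf returns a single-element Series containing the data. Although the Series prints as a column, I think this is what you want.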

#output of slopes['J01B']
-0.2781954887218037

#output of slopedf['J01B']
Slope   -0.278195
Name: J01B, dtype: float64
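
The code above hard-codes two observations, while the question has 58 columns. One way to extend it is to loop over the column names and pair each x column with its y column. This is only a sketch under assumptions: it assumes every observation is stored as `<name>_x` and `<name>_y` (matched case-insensitively, since the sample mixes `_X` and `_x`), and the helper name pairwise_slopes is hypothetical. The sample DataFrame is rebuilt from the rows shown in the question.

```python
import pandas as pd
from scipy import stats

# Sample data copied from the question (only two observations are shown there).
df = pd.DataFrame(
    {
        "J01B_X": [0.45, 0.22, 0.35],
        "J01B_y": [1.12, 1.18, 1.10],
        "J02C_x": [4.56, 5.5, 6.1],
        "J02C_y": [1.1, 0.8, 8.3],
    },
    index=pd.to_datetime(["2019-06-23", "2019-06-24", "2019-06-25"]),
)
df.index.name = "date"


def pairwise_slopes(frame):
    """Hypothetical helper: one regression slope per <name>_x / <name>_y pair."""
    # Map lower-cased column names to the real ones, so _X and _x both match.
    lookup = {col.lower(): col for col in frame.columns}
    result = {}
    for col in frame.columns:
        if not col.lower().endswith("_x"):
            continue
        name = col[:-2]  # strip the "_x" / "_X" suffix
        y_col = lookup.get(name.lower() + "_y")
        if y_col is None:
            continue  # no matching y column, so skip this one
        result[name] = stats.linregress(frame[col], frame[y_col]).slope
    return pd.Series(result, name="Slope")


slopes = pairwise_slopes(df)
slopedf = slopes.to_frame().T  # single-row DataFrame, one column per observation
print(slopedf)
```

If all 58 columns really do come in x/y pairs, this should give a single row with 29 slope columns. Columns that do not follow the assumed naming pattern are silently skipped, so it may be worth comparing len(slopes) against the expected count.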

python

linear-regression

pandas