Difference of two non-consecutive rows - Pandas

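I have a table of historical prices, and I want to calculate the variation in price for each currency. My code updates the list by fetching new prices and appending them to the database. How can I do this? This is how the elements look in the table: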


     Date        Hour       Currency     Price      Variation
0   2021-05-01  23:19:21    BAT         1.0700
1   2021-05-01  23:19:21    BTC     47922.1400
2   2021-05-01  23:19:21    DOGE        0.3286
3   2021-05-01  23:19:21    ETH      2451.7400
4   2021-05-01  23:35:50    BAT         1.0600
5   2021-05-01  23:35:50    BTC     47557.2700
6   2021-05-01  23:35:50    DOGE        0.3228
7   2021-05-01  23:35:50    ETH      2438.0300
8   2021-05-01  23:37:20    BAT         1.0500
9   2021-05-01  23:37:20    BTC     47467.0200
10  2021-05-01  23:37:20    DOGE        0.3209
11  2021-05-01  23:37:20    ETH      2435.3000

So, as you can see, the rows for a given currency are not consecutive. For example:

BAT price variation:

0 -> 4 : (1.0600-1.0700)/1.0700 = -0.93%
4 -> 8 : (1.0500-1.0600)/1.0600 = -0.94%
last_value_index -> recent_value_index : (recent_value-last_value)/last_value

Thanks!

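We can group by Currency and then apply pct_change() to the Price column. This assumes the rows are already in chronological order, as in the table above: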

df['Variation'] = 100*df.groupby('Currency').Price.pct_change()
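To make the one-liner above easier to try out, here is a minimal self-contained sketch. It rebuilds the first three snapshots from the question's table by hand (the DataFrame construction is an assumption about how the data is loaded) and assumes the rows are already sorted by time.

```python
import pandas as pd

# Sample rows copied from the question's table (first three snapshots).
df = pd.DataFrame({
    "Date": ["2021-05-01"] * 12,
    "Hour": ["23:19:21"] * 4 + ["23:35:50"] * 4 + ["23:37:20"] * 4,
    "Currency": ["BAT", "BTC", "DOGE", "ETH"] * 3,
    "Price": [1.0700, 47922.1400, 0.3286, 2451.7400,
              1.0600, 47557.2700, 0.3228, 2438.0300,
              1.0500, 47467.0200, 0.3209, 2435.3000],
})

# pct_change() runs within each Currency group, so each row is compared
# with the previous row of the same currency, not the previous row overall.
df["Variation"] = 100 * df.groupby("Currency")["Price"].pct_change()

# Show only the BAT rows (indexes 0, 4 and 8) to check against the question.
print(df[df["Currency"] == "BAT"])
```

The first row of each currency has no earlier price to compare with, so its Variation is NaN.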

Or compute the percentage change manually, dividing the difference by the previous price:

df['Variation'] = df.groupby('Currency').Price.transform(lambda x: 100*x.diff()/x.shift())
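A percentage change is (current - previous) / previous, so the denominator has to be the previous price in the same group. The sketch below, on a small hand-built sample (an assumption for illustration), computes it with diff() and shift() and puts it next to pct_change() for comparison.

```python
import pandas as pd

# Hypothetical two-currency sample, prices taken from the question's table.
df = pd.DataFrame({
    "Currency": ["BAT", "BTC", "BAT", "BTC", "BAT", "BTC"],
    "Price": [1.07, 47922.14, 1.06, 47557.27, 1.05, 47467.02],
})

grouped = df.groupby("Currency")["Price"]

# diff() gives current - previous; shift() gives the previous price,
# so their ratio is (current - previous) / previous within each group.
manual = 100 * grouped.diff() / grouped.shift()
builtin = 100 * grouped.pct_change()

# Both Series share df's index; NaN marks the first row of each currency.
print(pd.DataFrame({"manual": manual, "builtin": builtin}))
```

The two columns should agree up to floating-point rounding.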

Output for the newly provided df:
    Date        Hour        Currency    Price       Variation
0   2021-05-01  23:19:21    BAT         1.0700      NaN
1   2021-05-01  23:19:21    BTC         47922.1400  NaN
2   2021-05-01  23:19:21    DOGE        0.3286      NaN
3   2021-05-01  23:19:21    ETH         2451.7400   NaN
4   2021-05-01  23:35:50    BAT         1.0600      -0.934579
5   2021-05-01  23:35:50    BTC         47557.2700  -0.761381
6   2021-05-01  23:35:50    DOGE        0.3228      -1.765064
7   2021-05-01  23:35:50    ETH         2438.0300   -0.559195
8   2021-05-01  23:37:20    BAT         1.0500      -0.943396
9   2021-05-01  23:37:20    BTC         47467.0200  -0.189771
10  2021-05-01  23:37:20    DOGE        0.3209      -0.588600
11  2021-05-01  23:37:20    ETH         2435.3000   -0.111976
12  2021-05-02  00:04:40    BAT         1.0200      -2.857143
13  2021-05-02  00:04:40    BTC         46883.6300  -1.229043
14  2021-05-02  00:04:40    DOGE        0.3028      -5.640386
15  2021-05-02  00:04:40    ETH         2397.8200   -1.539030

If we want to fill the NaN values with some value, for example 0.0:

df['Variation'] = 100*df.groupby('Currency').Price.pct_change().fillna(0.)
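Since the question mentions appending freshly fetched prices, here is a rough sketch of that workflow. The names history and new_rows are hypothetical, and recomputing the whole column after each append is only one possible approach, reasonable while the table is small.

```python
import pandas as pd

# Existing history (hypothetical subset of the question's table).
history = pd.DataFrame({
    "Hour": ["23:19:21", "23:19:21", "23:35:50", "23:35:50"],
    "Currency": ["BAT", "BTC", "BAT", "BTC"],
    "Price": [1.07, 47922.14, 1.06, 47557.27],
})

# Newly fetched prices to append (values from the question's 23:37:20 rows).
new_rows = pd.DataFrame({
    "Hour": ["23:37:20", "23:37:20"],
    "Currency": ["BAT", "BTC"],
    "Price": [1.05, 47467.02],
})

# ignore_index=True keeps a clean 0..n-1 index after appending.
df = pd.concat([history, new_rows], ignore_index=True)

# Recompute over the whole table; the first row per currency has no
# predecessor, so its NaN is replaced by 0.0.
df["Variation"] = 100 * df.groupby("Currency")["Price"].pct_change().fillna(0.0)

print(df)
```

Note that filling with 0.0 makes a first observation indistinguishable from a price that genuinely did not change, so keeping NaN may be preferable if that distinction matters.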