Pandas: cumulative sum with conditional subtraction

Consider a pandas DataFrame such as:

>> df

date_time            op_type  price  volume
01-01-1970 9:30:01     ASK     100    1800 
01-01-1970 9:30:25     ASK      90    1000      
01-01-1970 9:30:28     BID      90     900
01-01-1970 9:30:28    TRADE     90     900
01-01-1970 9:31:01     BID      80     500
01-01-1970 9:31:09     ASK      80     100
01-01-1970 9:31:09    TRADE     80     100

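I want to compute three things: i) the cumulative sum of volume over the rows where op_type == "ASK"; ii) the cumulative sum of volume over the rows where op_type == "BID"; and iii) the sum of the previous two volumes.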

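That part is straightforward, but there is a condition for the op_type == "TRADE" operations:

  1. Whenever there is a TRADE operation whose price matches the price of a BID operation, I want to subtract the TRADE operation's volume from the cumulative BID volume.

  2. Whenever there is a TRADE operation whose price matches the price of an ASK operation, I want to subtract the TRADE operation's volume from the cumulative ASK volume.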
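
Read literally, the two rules amount to a pair of running totals. The loop below is only a sketch of that reading, not a pandas solution. It assumes that a TRADE refers to the row immediately before it (the sample data has an ASK and a BID at price 90, so "matching price" alone does not say which side to reduce); the names rows, totals, prev and out are made up for the illustration.

```python
# Sample rows as (op_type, price, volume), copied from the table above.
rows = [
    ("ASK", 100, 1800),
    ("ASK", 90, 1000),
    ("BID", 90, 900),
    ("TRADE", 90, 900),
    ("BID", 80, 500),
    ("ASK", 80, 100),
    ("TRADE", 80, 100),
]

totals = {"ASK": 0, "BID": 0}   # running cumulative volume per side
prev = None                      # (op_type, price) of the previous row
out = []                         # (ASK_vol, BID_vol, BIDASK_vol) per row

for op_type, price, volume in rows:
    if op_type in totals:
        # ASK or BID: add the volume to that side's running total.
        totals[op_type] += volume
    elif op_type == "TRADE" and prev is not None:
        prev_type, prev_price = prev
        # Assumption: the TRADE fills the order on the previous row,
        # and only counts if the prices are equal.
        if prev_type in totals and prev_price == price:
            totals[prev_type] -= volume
    out.append((totals["ASK"], totals["BID"], totals["ASK"] + totals["BID"]))
    prev = (op_type, price)

for row in out:
    print(row)
```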

The output I am looking for is:

>> df

date_time            op_type  price  volume  ASK_vol  BID_vol  BIDASK_vol
01-01-1970 9:30:01     ASK     100    1800    1800       0        1800
01-01-1970 9:30:25     ASK      90    1000    2800       0        2800
01-01-1970 9:30:28     BID      90     900    2800      900       3700
01-01-1970 9:30:28    TRADE     90     900    2800       0        2800
01-01-1970 9:31:01     BID      80     500    2800      500       3300
01-01-1970 9:31:09     ASK      80     100    2900      500       3400
01-01-1970 9:31:09    TRADE     80     100    2800      500       3300

I have read this question, but I am not sure how to incorporate the conditional subtraction into that answer. Any help would be appreciated. Thanks.

IIUC, this is what you need.

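import numpy as np

# Volume added by each ASK / BID row (0 on all other rows).
a = np.where(df['op_type'] == 'ASK', df.volume, 0)
b = np.where(df['op_type'] == 'BID', df.volume, 0)
# Negative volume on a TRADE row whose previous row is an ASK with the same volume.
# Note: the match is on the previous row's volume, not on price.
a_t = (np.where(df['op_type'] == 'TRADE',
          (np.where(df['op_type'].shift(1) == 'ASK',
                    (np.where(df['volume'] == df['volume'].shift(1), -df.volume, 0)), 0)), 0))
# Same adjustment for a TRADE row that directly follows a BID.
b_t = (np.where(df['op_type'] == 'TRADE',
          (np.where(df['op_type'].shift(1) == 'BID',
                    (np.where(df['volume'] == df['volume'].shift(1), -df.volume, 0)), 0)), 0))
# Take the TRADE adjustment where there is one, otherwise the added volume, then accumulate.
df['ASK_vol'] = (np.where(a_t != 0, a_t, a)).cumsum()
df['BID_vol'] = (np.where(b_t != 0, b_t, b)).cumsum()
df['BIDASK_vol'] = df['ASK_vol'] + df['BID_vol']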

Output

date_time            op_type  price  volume  ASK_vol  BID_vol  BIDASK_vol
01-01-1970 9:30:01     ASK     100    1800    1800       0        1800
01-01-1970 9:30:25     ASK      90    1000    2800       0        2800
01-01-1970 9:30:28     BID      90     900    2800      900       3700
01-01-1970 9:30:28    TRADE     90     900    2800       0        2800
01-01-1970 9:31:01     BID      80     500    2800      500       3300
01-01-1970 9:31:09     ASK      80     100    2900      500       3400
01-01-1970 9:31:09    TRADE     80     100    2800      500       3300
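
If the match should be on price, as the question words it, rather than on volume, a variant along these lines might work. It is a sketch under one assumption: a TRADE row always comes directly after the ASK or BID row it fills, so the previous row decides which side is reduced. The helper signed_volume and the intermediate names are made up for this example, and the sample frame is rebuilt so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date_time": ["01-01-1970 9:30:01", "01-01-1970 9:30:25",
                  "01-01-1970 9:30:28", "01-01-1970 9:30:28",
                  "01-01-1970 9:31:01", "01-01-1970 9:31:09",
                  "01-01-1970 9:31:09"],
    "op_type": ["ASK", "ASK", "BID", "TRADE", "BID", "ASK", "TRADE"],
    "price": [100, 90, 90, 90, 80, 80, 80],
    "volume": [1800, 1000, 900, 900, 500, 100, 100],
})

prev_type = df["op_type"].shift(1)               # side of the previous row
is_trade = df["op_type"].eq("TRADE")
price_match = df["price"].eq(df["price"].shift(1))  # same price as previous row


def signed_volume(side):
    """Per-row change in the cumulative volume of `side` ('ASK' or 'BID')."""
    # Volume added by rows of this side.
    added = df["volume"].where(df["op_type"].eq(side), 0)
    # Volume removed by a TRADE that follows a row of this side at the same price.
    removed = df["volume"].where(is_trade & prev_type.eq(side) & price_match, 0)
    return added - removed


df["ASK_vol"] = signed_volume("ASK").cumsum()
df["BID_vol"] = signed_volume("BID").cumsum()
df["BIDASK_vol"] = df["ASK_vol"] + df["BID_vol"]
print(df)
```

On the sample data this gives the same three columns as the table above. If trades can refer to orders further back than the previous row, the matching step would need a different rule.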