Pandas: cumulative sum with conditional subtraction
Consider a pandas DataFrame such as:
>> df
date_time op_type price volume
01-01-1970 9:30:01 ASK 100 1800
01-01-1970 9:30:25 ASK 90 1000
01-01-1970 9:30:28 BID 90 900
01-01-1970 9:30:28 TRADE 90 900
01-01-1970 9:31:01 BID 80 500
01-01-1970 9:31:09 ASK 80 100
01-01-1970 9:31:09 TRADE 80 100
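I would like to do three calculations: i) the cumulative sum of the volume for op_type == "ASK" rows; ii) the cumulative sum of the volume for op_type == "BID" rows; and iii) the sum of the previous two volumes.
That is simple enough, but there is a condition for op_type == "TRADE" operations:
Whenever there is a TRADE operation whose price matches the price of a BID operation, I want to subtract that TRADE operation's volume from the cumulative BID volume.
Whenever there is a TRADE operation whose price matches the price of an ASK operation, I want to subtract that TRADE operation's volume from the cumulative ASK volume.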
The output I am looking for is:
>> df
date_time op_type price volume ASK_vol BID_vol BIDASK_vol
01-01-1970 9:30:01 ASK 100 1800 1800 0 1800
01-01-1970 9:30:25 ASK 90 1000 2800 0 2800
01-01-1970 9:30:28 BID 90 900 2800 900 3700
01-01-1970 9:30:28 TRADE 90 900 2800 0 2800
01-01-1970 9:31:01 BID 80 500 2800 500 3300
01-01-1970 9:31:09 ASK 80 100 2900 500 3400
01-01-1970 9:31:09 TRADE 80 100 2800 500 3300
I have read this question, but I am not sure how to incorporate the conditional subtraction into that answer. Any help would be appreciated. Thanks.
IIUC, this is what you need.
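import numpy as np

# Volume contributed by each ASK / BID row (0 on every other row)
a = np.where(df['op_type'] == 'ASK', df['volume'], 0)
b = np.where(df['op_type'] == 'BID', df['volume'], 0)

prev_type = df['op_type'].shift(1)
# A TRADE is paired with the row directly above it when both volumes are equal
# (note: the comparison is on volume, not on price)
paired = (df['op_type'] == 'TRADE') & (df['volume'] == df['volume'].shift(1))

# Negative volume for a TRADE paired with an ASK / BID row (0 on every other row)
a_t = np.where(paired & (prev_type == 'ASK'), -df['volume'], 0)
b_t = np.where(paired & (prev_type == 'BID'), -df['volume'], 0)

# Use the negative TRADE volume where there is one, else the quote volume, then accumulate
df['ASK_vol'] = np.where(a_t != 0, a_t, a).cumsum()
df['BID_vol'] = np.where(b_t != 0, b_t, b).cumsum()
df['BIDASK_vol'] = df['ASK_vol'] + df['BID_vol']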
Output:
date_time op_type price volume ASK_vol BID_vol BIDASK_vol
01-01-1970 9:30:01 ASK 100 1800 1800 0 1800
01-01-1970 9:30:25 ASK 90 1000 2800 0 2800
01-01-1970 9:30:28 BID 90 900 2800 900 3700
01-01-1970 9:30:28 TRADE 90 900 2800 0 2800
01-01-1970 9:31:01 BID 80 500 2800 500 3300
01-01-1970 9:31:09 ASK 80 100 2900 500 3400
01-01-1970 9:31:09 TRADE 80 100 2800 500 3300
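Note that the snippet above pairs a TRADE with the row directly above it and compares volumes, whereas the question states the condition in terms of price. Below is a minimal sketch of a price-based variant. It rests on one assumption that the question does not spell out but that is consistent with the expected output: a TRADE is charged to the side (ASK or BID) of the most recent earlier quote at the same price. The names `is_trade`, `side`, `matched` and `delta` are made up for illustration, and the frame is rebuilt from the sample data so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "date_time": ["01-01-1970 9:30:01", "01-01-1970 9:30:25", "01-01-1970 9:30:28",
                  "01-01-1970 9:30:28", "01-01-1970 9:31:01", "01-01-1970 9:31:09",
                  "01-01-1970 9:31:09"],
    "op_type": ["ASK", "ASK", "BID", "TRADE", "BID", "ASK", "TRADE"],
    "price": [100, 90, 90, 90, 80, 80, 80],
    "volume": [1800, 1000, 900, 900, 500, 100, 100],
})

is_trade = df["op_type"].eq("TRADE")

# Side of each quote row; missing on TRADE rows
side = df["op_type"].where(~is_trade)

# Assumption: a TRADE takes the side of the latest earlier quote at the same price.
# Forward-filling within each price group gives exactly that; quote rows keep their own side.
matched = side.groupby(df["price"]).ffill()

# Quotes add their volume, trades subtract it
delta = df["volume"] * np.where(is_trade, -1, 1)

df["ASK_vol"] = delta.where(matched.eq("ASK"), 0).cumsum()
df["BID_vol"] = delta.where(matched.eq("BID"), 0).cumsum()
df["BIDASK_vol"] = df["ASK_vol"] + df["BID_vol"]

print(df)
```

On the sample data this should give the same three columns as the output above. A TRADE with no earlier quote at its price is left unmatched and changes neither total; whether that is the desired behaviour depends on the data.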