Summing Rows in Pandas Dataframe returning NaN
I am trying to sum each row in a Pandas dataframe:
new_df['cash_change'] = new_df.sum(axis=0)
But my result keeps coming back as NaN.
I think it may have to do with my converting positions to Decimal for the multiplication:
pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])
I had to do this in order to multiply my columns. But I cast the result back to float:
cash_change[security.symbol] = cash_change[security.symbol].astype(float)
Here is the full method. Its objective is to perform some column multiplication for each security and then sum the totals at the end:
def get_cash_change(self):
    """
    Calculate daily cash to be transacted every day. Cash change depends on
    the position (either buy or sell) multiplied by the adjusted closing price
    of the equity multiplied by the trade amount.
    :return:
    """
    cash_change = pd.DataFrame(index=self.positions.index)
    try:
        for security in self.market_on_close_securities:
            # First convert all the positions from floating-point to decimals
            pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])
            cash_change['positions'] = pos_to_dec
            cash_change['bars'] = security.bars['adj_close_price'].values
            # Perform calculation for cash change
            cash_change[security.symbol] = cash_change['positions'] * cash_change['bars'] * self.trade_amount
            cash_change[security.symbol] = cash_change[security.symbol].astype(float)
            # Clean up for next security
            cash_change.drop('positions', axis=1, inplace=True)
            cash_change.drop('bars', axis=1, inplace=True)
    except InvalidOperation as e:
        print("Invalid input : " + str(e))
    # Sum each equities change in cash
    new_df = cash_change.dropna()
    new_df['cash_change'] = new_df.sum(axis=0)
    return cash_change
My new_df dataframe ends up looking like this:
MTD ESS SIG SNA cash_change
price_date
2000-01-04 0.0 0.00 0.00 0.00 NaN
2000-01-05 0.0 0.00 0.00 0.00 NaN
2000-01-06 0.0 0.00 0.00 0.00 NaN
2000-01-07 0.0 0.00 0.00 0.00 NaN
2000-01-10 0.0 0.00 0.00 0.00 NaN
2000-01-11 0.0 0.00 0.00 0.00 NaN
2000-01-12 0.0 0.00 0.00 0.00 NaN
2000-01-13 0.0 0.00 0.00 0.00 NaN
2000-01-14 0.0 0.00 0.00 0.00 NaN
2000-01-18 0.0 0.00 0.00 0.00 NaN
2000-01-19 0.0 0.00 0.00 0.00 NaN
2000-01-20 0.0 0.00 0.00 0.00 NaN
2000-01-21 0.0 0.00 0.00 0.00 NaN
2000-01-24 0.0 1747.83 1446.71 0.00 NaN
2000-01-25 3419.0 0.00 0.00 0.00 NaN
2000-01-26 0.0 0.00 0.00 1660.38 NaN
2000-01-27 0.0 0.00 -1293.27 0.00 NaN
2000-01-28 0.0 0.00 0.00 0.00 NaN
Any suggestions on what I am doing wrong? Or perhaps another way to sum the columns of each row?
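When you pass axis=0 to the DF.sum method, it sums along the index (vertically, if that is easier to picture). You therefore get only 4 values, one for each of the dataframe's 4 columns. You then assign this result to a new column of the dataframe. The result is indexed by the column names rather than by the dataframe's dates, so the two share no labels to align on, and reindexing leaves you with a series of NaN elements.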
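The misalignment described above can be reproduced on a small frame. This is only a sketch: the two-row `df`, its string date labels, and the name `col_totals` are made up for illustration and are not taken from the asker's data.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question (string date labels).
df = pd.DataFrame(
    {"MTD": [0.0, 3419.0], "ESS": [1747.83, 0.0]},
    index=["2000-01-24", "2000-01-25"],
)

# axis=0 collapses the rows: one total per column, indexed by column label.
col_totals = df.sum(axis=0)
print(col_totals.index.tolist())  # ['MTD', 'ESS']

# Column assignment aligns on index labels. No row label equals 'MTD' or
# 'ESS', so every entry of the new column is NaN.
df["cash_change"] = col_totals
print(df["cash_change"].isna().all())  # True
```

Comparing `col_totals.index` with `df.index` is a quick way to spot this kind of mismatch before assigning.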
What you actually want is to sum across the columns (horizontally).
Change that line to:
new_df['cash_change'] = new_df.sum(axis=1) # sum row-wise across each column
Now you will get finite computed sums.
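As a rough check, here is the row-wise version run on three rows copied from the output in the question. The frame name `rows` and the string index are assumptions made for this sketch, and the values are rounded only to keep the printout tidy.

```python
import pandas as pd

# Three rows taken from the question's output, rebuilt by hand.
rows = pd.DataFrame(
    {
        "MTD": [0.0, 3419.0, 0.0],
        "ESS": [1747.83, 0.0, 0.0],
        "SIG": [1446.71, 0.0, -1293.27],
        "SNA": [0.0, 0.0, 0.0],
    },
    index=["2000-01-24", "2000-01-25", "2000-01-27"],
)
rows.index.name = "price_date"

# axis=1 adds across the columns, so the result has one total per row and
# shares the frame's index; the assignment aligns cleanly.
rows["cash_change"] = rows.sum(axis=1)
print(rows["cash_change"].round(2).tolist())  # [3194.54, 3419.0, -1293.27]
```

Because the sum is computed before the new column exists, `cash_change` is not included in its own total.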