Summing Rows in Pandas Dataframe returning NAN

Question

I'm trying to sum each row in a Pandas DataFrame:

new_df['cash_change'] = new_df.sum(axis=0)

But my result keeps coming back as NaN.

I suspect this may be related to converting positions to Decimal for the multiplication:

pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])

I had to do this to be able to multiply my columns. But I cast the result back to float:

cash_change[security.symbol] = cash_change[security.symbol].astype(float)

Here is the full method. Its objective is to perform some column multiplication for each security, and then sum the totals at the end:

def get_cash_change(self):
    """
    Calculate daily cash to be transacted every day. Cash change depends on
    the position (either buy or sell) multiplied by the adjusted closing price
    of the equity multiplied by the trade amount.
    :return:
    """
    cash_change = pd.DataFrame(index=self.positions.index)
    try:

        for security in self.market_on_close_securities:
            # First convert all the positions from floating-point to decimals
            pos_to_dec = np.array([Decimal(d) for d in security.signals['positions'].values])

            cash_change['positions'] = pos_to_dec
            cash_change['bars'] = security.bars['adj_close_price'].values

            # Perform calculation for cash change
            cash_change[security.symbol] = cash_change['positions'] * cash_change['bars'] * self.trade_amount

            cash_change[security.symbol] = cash_change[security.symbol].astype(float)

            # Clean up for next security
            cash_change.drop('positions', axis=1, inplace=True)
            cash_change.drop('bars', axis=1, inplace=True)

    except InvalidOperation as e :
        print("Invalid input : " + str(e))

    # Sum each equities change in cash
    new_df = cash_change.dropna()

    new_df['cash_change'] = new_df.sum(axis=0)

    return cash_change

My new_df DataFrame ends up looking like this:

                MTD       ESS      SIG       SNA  cash_change
price_date                                                   
2000-01-04      0.0      0.00     0.00      0.00          NaN
2000-01-05      0.0      0.00     0.00      0.00          NaN
2000-01-06      0.0      0.00     0.00      0.00          NaN
2000-01-07      0.0      0.00     0.00      0.00          NaN
2000-01-10      0.0      0.00     0.00      0.00          NaN
2000-01-11      0.0      0.00     0.00      0.00          NaN
2000-01-12      0.0      0.00     0.00      0.00          NaN
2000-01-13      0.0      0.00     0.00      0.00          NaN
2000-01-14      0.0      0.00     0.00      0.00          NaN
2000-01-18      0.0      0.00     0.00      0.00          NaN
2000-01-19      0.0      0.00     0.00      0.00          NaN
2000-01-20      0.0      0.00     0.00      0.00          NaN
2000-01-21      0.0      0.00     0.00      0.00          NaN
2000-01-24      0.0   1747.83  1446.71      0.00          NaN
2000-01-25   3419.0      0.00     0.00      0.00          NaN
2000-01-26      0.0      0.00     0.00   1660.38          NaN
2000-01-27      0.0      0.00 -1293.27      0.00          NaN
2000-01-28      0.0      0.00     0.00      0.00          NaN

Any suggestions on what I'm doing wrong? Or perhaps another way to sum the columns of each row?

Answer 1

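When you pass axis=0 to DataFrame.sum, the summation runs along the index (vertically, if that is easier to picture). You therefore get just 4 values, one for each of the DataFrame's 4 columns. You then assign this result to a new column of the DataFrame. Because the result is indexed by column name while the DataFrame is indexed by price_date, the two share no labels to align on, so reindexing leaves every row as NaN.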
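To make the alignment problem concrete, here is a minimal sketch. The two-row frame is a hypothetical excerpt shaped like the one in the question (string date labels stand in for the real price_date index), not the asker's actual data.

```python
import pandas as pd

# Hypothetical excerpt: two securities, two dates (values taken from the question's table).
df = pd.DataFrame(
    {"MTD": [0.0, 3419.0], "ESS": [1747.83, 0.0]},
    index=pd.Index(["2000-01-24", "2000-01-25"], name="price_date"),
)

# axis=0 collapses the rows: the result has one entry per column,
# and its index holds the column names, not the dates.
col_totals = df.sum(axis=0)
print(col_totals.index.tolist())  # ['MTD', 'ESS']

# Column assignment aligns on index labels. None of 'MTD'/'ESS' matches
# a date label, so every row of the new column is filled with NaN.
df["cash_change"] = col_totals
print(df["cash_change"].isna().all())  # True
```

The sum itself succeeds; the NaN values only appear at the assignment step, when pandas reindexes the result against the DataFrame's index.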

What you actually want is to sum across the columns (horizontally).

Change that line to:

new_df['cash_change'] = new_df.sum(axis=1)  # sum row-wise across each column

Now you will get finite sums.
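As a quick check of the fix, the sketch below applies axis=1 to a small hypothetical frame built from three rows of the question's table. The variable names are illustrative only.

```python
import pandas as pd

# Hypothetical three-row excerpt of the question's table.
df = pd.DataFrame(
    {
        "MTD": [0.0, 3419.0, 0.0],
        "ESS": [1747.83, 0.0, 0.0],
        "SIG": [1446.71, 0.0, -1293.27],
    },
    index=pd.Index(["2000-01-24", "2000-01-25", "2000-01-27"], name="price_date"),
)

# axis=1 collapses the columns: one total per row,
# indexed by the same labels as the DataFrame itself.
row_totals = df.sum(axis=1)
assert row_totals.index.equals(df.index)

# The indexes match, so the assignment aligns row by row and no NaN appears.
df["cash_change"] = row_totals
print(df["cash_change"].round(2))
```

Because the row totals carry the same index as the frame, the assignment is a plain label-for-label match, which is why the NaN values disappear.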

Answer 2

new_df['cash_change'] = new_df.sum(axis=1)

Tags: numpy, nan, dataframe, python-2.7, pandas