Run a Python function over two DataFrame columns

I have run into a problem that I think should be simple. I have a function that I want to apply to two columns of my DataFrame, but I get this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here is what I am trying to do:

import numpy as np

# Calculate the accuracy
def mape(actual,pred):
  if actual == 0:
    if pred == 0:
      return 0
    else:
      return 100
  else:
    return np.mean(np.abs((actual - pred) / actual)) * 100

Then I try to apply it to two columns (named Actuals_March and Forecast_March).

# This line runs into the ValueError above. 
# I removed all NaN values before running this. 
df['MAPE_Mar'] = df.apply(lambda x: mape(df.Actuals_March , df.Forecast_March), axis=1)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
# This is a snapshot of my data:
df.Actuals_March       df.Forecast_March
          0.0     0.0
          0.0     0.0
          0.0     0.0
          4.0     0.0
          0.0     0.0
          5.0     0.0
         20.0     0.0
          0.0     0.0
          2.0     0.0
         13.0     0.0

I hope you can help me. Thanks in advance.

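Replace df with x inside the lambda. With axis=1, x is a single row, so x.Actuals_March and x.Forecast_March are scalars rather than whole columns: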

df['MAPE_Mar'] = df.apply(lambda x: mape(x.Actuals_March , x.Forecast_March), axis=1)
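To see why the original call fails: inside the lambda, df.Actuals_March is the entire column, so `actual == 0` produces a boolean Series and `if` cannot reduce it to a single True or False. The sketch below reproduces both behaviours on a small made-up frame (the three rows are illustrative values, not the asker's data, and mape is condensed slightly with np.mean dropped, since it has no effect on a scalar).

```python
import numpy as np
import pandas as pd

def mape(actual, pred):
    # Scalar version of the function from the question.
    if actual == 0:
        return 0 if pred == 0 else 100
    return np.abs((actual - pred) / actual) * 100

# Illustrative data only (assumed values).
df = pd.DataFrame({
    "Actuals_March": [0.0, 4.0, 0.0],
    "Forecast_March": [0.0, 0.0, 3.0],
})

# Passing whole columns: `actual == 0` is a Series, so `if` raises ValueError.
try:
    mape(df.Actuals_March, df.Forecast_March)
    raised = False
except ValueError:
    raised = True

# Passing the scalars of each row works as intended.
row_wise = df.apply(lambda x: mape(x.Actuals_March, x.Forecast_March), axis=1)
```

Here `raised` ends up True, while `row_wise` holds one value per row.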

Vectorized alternative:

m1 = df['Actuals_March'] == 0
m2 = df['Forecast_March'] == 0
s = (np.abs(df['Actuals_March'] - df['Forecast_March']) / df['Actuals_March']) * 100

# 0 when both are zero, 100 when only the actual is zero, otherwise the percentage error
df['MAPE_Mar1'] = np.select([m1 & m2, m1 & ~m2], [0, 100], s)
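A quick way to check a vectorized version against the row-wise function is to run both on a frame that covers every branch, including the case where the actual is zero but the forecast is not. The sketch below assumes the 100 case is selected with `m1 & ~m2` (actual is zero, forecast is not), which is what the mape function in the question returns for such rows. The data is made up for illustration.

```python
import numpy as np
import pandas as pd

def mape(actual, pred):
    # Scalar version of the function from the question.
    if actual == 0:
        return 0 if pred == 0 else 100
    return np.abs((actual - pred) / actual) * 100

# Illustrative data only (assumed values), one row per branch.
df = pd.DataFrame({
    "Actuals_March": [0.0, 4.0, 0.0, 5.0],
    "Forecast_March": [0.0, 0.0, 3.0, 4.0],
})

m1 = df["Actuals_March"] == 0
m2 = df["Forecast_March"] == 0
# Rows with a zero actual produce NaN or inf here; np.select overrides them.
s = (df["Actuals_March"] - df["Forecast_March"]).abs() / df["Actuals_March"] * 100

vectorized = np.select([m1 & m2, m1 & ~m2], [0, 100], default=s)
row_wise = df.apply(lambda x: mape(x.Actuals_March, x.Forecast_March), axis=1)

same = np.allclose(vectorized, row_wise)
```

If `same` is True, the two approaches agree on every branch. On large frames the vectorized form is usually much faster, because it avoids calling a Python function once per row.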