How do I use assertions in Python 3 with a pandas DataFrame?
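If I have two columns in a pandas DataFrame, I want to run an assertion that checks whether they are equal, whether one is greater than or equal to the other, or some other boolean test on the two columns.
Right now I'm doing something like this: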
# Roll the fields up so we can compare both reports.
# Goal: Show that `Gross Sales per Bar` is equal to `Gross Sales per Category`
#
# Do a GROUP BY of all the service bars and sum their Gross Sales per Category.
# Since every row for a bar should carry the same 'Gross Sales per Bar' value,
# grab the first one, so we can compare them below
df_bar_sum = sbbac.groupby(['Bar'], as_index=False)['Gross Sales per Bar'].first()
df_bar_sum2 = sbbac.groupby(['Bar'], as_index=False)['Gross Sales per Category'].sum()
# Rename the 'Gross Sales per Category' column to 'Summed Gross Sales per Category'
df_bar_sum2.rename(columns={'Gross Sales per Category':'Summed Gross Sales per Category'}, inplace=True)
# Add the 'Gross Sales per Bar' column to the df_bar_sum2 Data Frame.
df_bar_sum2['Gross Sales per Bar'] = df_bar_sum['Gross Sales per Bar']
# See if they match...they should since the value of 'Gross Sales per Bar' should be equal to 'Gross Sales per Category' summed.
df_bar_sum2['GrossSalesPerCat_GrossSalesPerBar_eq'] = df_bar_sum2.apply(lambda row: 1 if row['Summed Gross Sales per Category'] == row['Gross Sales per Bar'] else 0, axis=1)
# Print the result
df_bar_sum2
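I end up with a column that equals 1 if the values match and 0 if they don't.
I'd like to use an assertion here to test whether they match, because a failed assertion stops the whole run during testing and displays an error. That may not be a good approach for tabular data, I'm not sure, but if it is a good idea I'd rather use assertions to compare them.
Assertions might also be harder to read, which would be bad; I'm not sure...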
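import numpy as np
# Passes when the two columns are element-wise equal within floating-point tolerance.
assert np.allclose(your_df['Summed Gross Sales per Category'],
                   your_df['Gross Sales per Bar'])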
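The question also asks about tests other than equality, so here is a minimal sketch of a few assertion styles on two columns. The small `df` frame is made-up stand-in data, not the asker's `sbbac` report, and the variable names `summed`, `per_bar` and `mismatch` are hypothetical. Note that a bare `assert summed >= per_bar` would raise a `ValueError`, because the truth value of a Series is ambiguous; the boolean Series has to be reduced with `.all()` (or `.any()`) first.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the rolled-up frame from the question.
df = pd.DataFrame({
    "Bar": ["A", "B", "C"],
    "Summed Gross Sales per Category": [100.0, 250.5, 75.25],
    "Gross Sales per Bar": [100.0, 250.5, 75.25],
})

summed = df["Summed Gross Sales per Category"]
per_bar = df["Gross Sales per Bar"]

# Equality within a floating-point tolerance; the message names the offending bars.
mismatch = ~np.isclose(summed, per_bar)
assert not mismatch.any(), f"Totals differ for bars: {df.loc[mismatch, 'Bar'].tolist()}"

# Other element-wise tests: reduce the boolean Series with .all() before asserting.
assert (summed >= per_bar).all(), "Some summed totals are below the per-bar total"

# pandas' own helper raises an AssertionError describing the difference.
pd.testing.assert_series_equal(summed, per_bar, check_names=False)
```

Putting the mismatching rows in the assertion message is one way to address the readability worry: when the check fails, the error says which bars are off instead of only reporting that something differed.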