Fill in missing values for missing dates in dataframe

I have the following dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        'status': ['open', 'closed', 'open', 'closed', 'open', 'closed', 'open', 'closed'],
        'month': ['January 2020', 'January 2020', 'February 2020', 'February 2020', 'April 2020', 'April 2020', 'August 2020', 'August 2020'],
        'counts': [10, 12, 32, 12, 19, 40, 10, 11]
    }
)
    status  month           counts
0   open    January 2020    10
1   closed  January 2020    12
2   open    February 2020   32
3   closed  February 2020   12
4   open    April 2020      19
5   closed  April 2020      40
6   open    August 2020     10
7   closed  August 2020     11

I am trying to get a stacked bar chart using seaborn:

import seaborn as sns

sns.histplot(df, x='month', weights='counts', hue='status', multiple='stack')

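The goal is a plot with a continuous time axis that does not skip any months. How can I fill in the missing rows with zero counts so that the dataframe looks like this?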

status  month           counts
open    January 2020    10
closed  January 2020    12
open    February 2020   32
closed  February 2020   12
open    March 2020      0
closed  March 2020      0
open    April 2020      19
closed  April 2020      40
open    May 2020        0
closed  May 2020        0
open    June 2020       0
closed  June 2020       0
open    July 2020       0
closed  July 2020       0
open    August 2020     10
closed  August 2020     11

You can pivot the dataframe and then reindex it with the desired months.

import pandas as pd

df = pd.DataFrame({'status': ['open', 'closed', 'open', 'closed', 'open', 'closed', 'open', 'closed'],
                   'month': ['January 2020', 'January 2020', 'February 2020', 'February 2020', 'April 2020', 'April 2020', 'August 2020', 'August 2020'],
                   'counts': [10, 12, 32, 12, 19, 40, 10, 11]})

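# Full list of month labels, including the ones missing from the data
months = [f'{m} 2020' for m in ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August']]
# One row per month, one column per status; months absent from the data become NaN, then 0
df_pivoted = df.pivot(values='counts', index='month', columns='status').reindex(months).fillna(0)
# pandas stacks the status columns on top of each other for every month
ax = df_pivoted.plot.bar(stacked=True, width=1, ec='black', rot=0, figsize=(12, 5))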
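
If you need the long-format dataframe exactly as shown in the question, rather than the pivoted one, a minimal sketch is to reindex against every (month, status) combination. The names `parsed`, `all_months`, `full_index`, and `df_filled` are my own, and the sketch assumes the month labels parse with `'%B %Y'` under an English locale:

```python
import pandas as pd

df = pd.DataFrame({
    'status': ['open', 'closed'] * 4,
    'month': ['January 2020', 'January 2020', 'February 2020', 'February 2020',
              'April 2020', 'April 2020', 'August 2020', 'August 2020'],
    'counts': [10, 12, 32, 12, 19, 40, 10, 11],
})

# Parse the labels so the month range is computed instead of typed out
# (assumption: English month names, format '%B %Y').
parsed = pd.to_datetime(df['month'], format='%B %Y')
all_months = pd.date_range(parsed.min(), parsed.max(), freq='MS').strftime('%B %Y')

# Every (month, status) pair; months vary slowest, so rows come out month by month.
full_index = pd.MultiIndex.from_product(
    [all_months, df['status'].unique()], names=['month', 'status'])

# Reindex onto the full grid; pairs missing from the data get a count of 0.
df_filled = (
    df.set_index(['month', 'status'])
      .reindex(full_index, fill_value=0)
      .reset_index()[['status', 'month', 'counts']]
)
print(df_filled)
```

The resulting `df_filled` has one row per status for each month from January to August 2020, so it should be usable with the `histplot` call from the question without gaps on the x-axis.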

A seaborn alternative is to pass order=. Note that order= is not available for histplot, only for barplot, and barplot does not stack the bars.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.DataFrame({'status': ['open', 'closed', 'open', 'closed', 'open', 'closed', 'open', 'closed'],
                   'month': ['January 2020', 'January 2020', 'February 2020', 'February 2020', 'April 2020', 'April 2020', 'August 2020', 'August 2020'],
                   'counts': [10, 12, 32, 12, 19, 40, 10, 11]})

plt.figure(figsize=(12, 5))
months = [f'{m} 2020' for m in ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August']]
ax = sns.barplot(data=df, x='month', y='counts', hue='status', order=months)
plt.tight_layout()
plt.show()