Plotting multiple columns groupby on multiple plots

Question

I have data like this:

ID    value_y   date_x      end_cutoff
1      75     2020-7-1      2021-01-17
1      73     2020-7-2      2021-01-17
1      74     2020-7-1      2021-06-05
1      71     2020-7-2      2021-06-05
2      111    2020-7-1      2021-01-17
2      112    2020-7-2      2021-01-17
2      113    2020-7-1      2021-06-05
2      115    2020-7-2      2021-06-05

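I want to plot this data so that the following conditions are met:

There is one plot per ID.
Each plot has n lines (2 in this example: one line per end_cutoff).

So, ideally, in this example I would have two separate plots, each with two lines.

Here is my current code, but it draws everything on the same plot instead of creating a new plot for each ID.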

 grouped = df_fit.groupby(['ID','end_cutoff'])
 fig, ax = plt.subplots()
 for (ID, end_cutoff), df_fit in grouped:
     ax.plot(df_fit['date_x'], df_fit['value_y'], label=ID+' '+str(end_cutoff.date()))
 plt.show()

Answer 1

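This solution adds the missing pieces to your existing code:

Format the date columns as a datetime dtype and extract only the date component.
Create a number of subplots equal to the number of unique 'ID' values.
Get the index of the current ID in uid, and use that value to index and plot to the correct ax.

This option uses pandas.DataFrame.plot.
The x-axis tick labels are formatted as '%m-%d %H' because the time between points is small. The x-axis is formatted automatically depending on the range of the dates.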
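
The index lookup in the last step can be tried in isolation. This is a minimal sketch, assuming uid holds the unique IDs in order of first appearance, which is how Series.unique() returns them. The helper name axis_index is hypothetical and only used for illustration.

```python
import numpy as np

# Assumption: uid holds the unique IDs in order of first appearance
uid = np.array([1, 2])

def axis_index(uid, ID):
    """Return the position of ID in uid (hypothetical helper for illustration)."""
    # uid == ID is a boolean mask; argwhere returns the positions where it is True
    return int(np.argwhere(uid == ID)[0][0])

first = axis_index(uid, 1)   # position of ID 1, used to pick the first subplot
second = axis_index(uid, 2)  # position of ID 2, used to pick the second subplot
```

Because the position in uid matches the row of the subplot grid, the same integer can be used directly as ax[axi].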

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# dataframe
data = {'ID': [1, 1, 1, 1, 2, 2, 2, 2], 'value_y': [75, 73, 74, 71, 111, 112, 113, 115], 'date_x': ['2020-7-1', '2020-7-2', '2020-7-1', '2020-7-2', '2020-7-1', '2020-7-2', '2020-7-1', '2020-7-2'], 'end_cutoff': ['2021-01-17', '2021-01-17', '2021-06-05', '2021-06-05', '2021-01-17', '2021-01-17', '2021-06-05', '2021-06-05']}
df = pd.DataFrame(data)

# set date columns to a datetime dtype and extract only the date component since time isn't relevant
df['end_cutoff'] =  pd.to_datetime(df['end_cutoff']).dt.date
df['date_x'] =  pd.to_datetime(df['date_x']).dt.date

# create grouped
grouped = df.groupby(['ID','end_cutoff'])

# create subplots based on the number of unique ID values
uid = df.ID.unique()
fig, ax = plt.subplots(nrows=len(uid), figsize=(7, 4))

for (ID, end_cutoff), df_fit in grouped:
    
    # get the index of the current ID, and use it to index ax
    axi = np.argwhere(uid==ID)[0][0]

    # plot to the correct ax based on the index of the ID
    df_fit.plot(x='date_x', y='value_y', ax=ax[axi], label=f'{ID} {end_cutoff}',
                xlabel='Date', ylabel='Value', title=f'ID: {ID}', marker='.', rot=30)

    # place the legend outside the plot
    ax[axi].legend(title='Cutoff', bbox_to_anchor=(1.05, 1), loc='upper left')

plt.tight_layout()
plt.show()
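
If a separate figure per ID is preferred over stacked subplots (the question asks for two separate plots), one alternative is to group by 'ID' first and then by 'end_cutoff' inside each group. This is a sketch under stated assumptions, not part of the original answer: the function name plot_per_id is hypothetical, the sample dates are written zero-padded, and the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

def plot_per_id(df):
    """Create one figure per ID with one line per end_cutoff (hypothetical helper)."""
    figures = {}
    for ID, id_df in df.groupby('ID'):
        # a new figure for every ID, so nothing is shared between IDs
        fig, ax = plt.subplots(figsize=(7, 3))
        for end_cutoff, cut_df in id_df.groupby('end_cutoff'):
            ax.plot(cut_df['date_x'], cut_df['value_y'],
                    marker='.', label=str(end_cutoff.date()))
        ax.set(title=f'ID: {ID}', xlabel='Date', ylabel='Value')
        ax.tick_params(axis='x', rotation=30)
        ax.legend(title='Cutoff')
        fig.tight_layout()
        figures[ID] = fig
    return figures

# sample data from the question
df = pd.DataFrame({
    'ID': [1, 1, 1, 1, 2, 2, 2, 2],
    'value_y': [75, 73, 74, 71, 111, 112, 113, 115],
    'date_x': ['2020-07-01', '2020-07-02'] * 4,
    'end_cutoff': ['2021-01-17', '2021-01-17', '2021-06-05', '2021-06-05'] * 2,
})
df['date_x'] = pd.to_datetime(df['date_x'])
df['end_cutoff'] = pd.to_datetime(df['end_cutoff'])

figures = plot_per_id(df)
plt.show()
```

The returned dictionary maps each ID to its figure, so an individual figure can be saved afterwards, for example with figures[1].savefig('id_1.png').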
