Overlap graph of missing values (NaN values) with filled values
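I have the following pandas DataFrame with two columns. The first column holds the original values, including missing values (NaN), and the second column holds the result of missing-value imputation, i.e. the values that fill the NaNs in the first column. How can I plot both columns in the same chart so that the original values are shown together with the filled values, as in the figure below: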
Data=pd.DataFrame([[3.83092724, np.nan],
[ np.nan, 3.94103207],
[ np.nan, 3.86621724],
[3.48386179, np.nan],
[ np.nan, 3.7430167 ],
[3.2382959 , np.nan],
[3.9143139 , np.nan],
[4.46676265, np.nan],
[ np.nan, 3.9340262 ],
[3.650658 , np.nan],
[ np.nan, 3.10590516],
[4.19497691, np.nan],
[4.11873876, np.nan],
[4.15286075, np.nan],
[4.67441617, np.nan],
[4.50631534, np.nan],
[ np.nan, 4.01349688],
[ np.nan, 3.48459778],
[ np.nan, 3.83495488],
[ np.nan, 3.10590516],
[ np.nan, 4.09355884],
[4.8433281 , np.nan],
[ np.nan, 3.33450675],
[4.86672126, np.nan],
[ np.nan, 3.2382959 ],
[ np.nan, 3.48210011],
[ np.nan, 3.00958811],
[ np.nan, 3.05774663]], columns=['original', 'filled'])
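You need markers; otherwise the plot makes no sense wherever a single original value is surrounded by missing values, because a lone point has no line segment to draw.
We first plot the original values. Then, in the filled column, every missing value directly adjacent to an existing filled value is replaced with the original value at that position, so that the dashed line runs from that original value to the next or preceding filled value. Finally, we plot this modified filled column as a dashed line.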
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df=pd.DataFrame([[3.83092724, np.nan],
[ np.nan, 3.94103207],
[ np.nan, 3.86621724],
[3.48386179, np.nan],
[ np.nan, 3.7430167 ],
[3.2382959 , np.nan],
[3.9143139 , np.nan],
[4.46676265, np.nan],
[ np.nan, 3.9340262 ],
[3.650658 , np.nan],
[ np.nan, 3.10590516],
[4.19497691, np.nan],
[4.11873876, np.nan],
[4.15286075, np.nan],
[4.67441617, np.nan],
[4.50631534, np.nan],
[ np.nan, 4.01349688],
[ np.nan, 3.48459778],
[ np.nan, 3.83495488],
[ np.nan, 3.10590516],
[ np.nan, 4.09355884],
[4.8433281 , np.nan],
[ np.nan, 3.33450675],
[4.86672126, np.nan],
[ np.nan, 3.2382959 ],
[ np.nan, 3.48210011],
[ np.nan, 3.00958811],
[ np.nan, 3.05774663]], columns=['original', 'filled'])
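_, ax = plt.subplots()
# Original values: solid line with markers so isolated points stay visible
df.original.plot(marker='o', ax=ax)
# Rows where 'filled' is NaN but the previous or next row holds a filled value
m = df.filled.isna() & (df.filled.shift(1).notna() | df.filled.shift(-1).notna())
# Copy the original values at those rows into 'filled', then draw it dashed in the same colour
df.filled.fillna(df.loc[m, 'original']).plot(ls='--', ax=ax, color=ax.get_lines()[0].get_color())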
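To see which rows the mask actually selects, here is a minimal sketch on a hypothetical five-row frame (the name `toy` and its values are made up for illustration and are not part of the question's data). It only prints the mask and the bridged series; no plotting is involved.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one imputed gap (rows 1-2) between original values
toy = pd.DataFrame({
    'original': [1.0, np.nan, np.nan, 4.0, 5.0],
    'filled':   [np.nan, 2.0, 3.0, np.nan, np.nan],
})

# Rows where 'filled' is missing but the previous or next row is filled
m = toy.filled.isna() & (toy.filled.shift(1).notna() | toy.filled.shift(-1).notna())

# Only the boundary originals (rows 0 and 3) are copied into the dashed series;
# row 4 is not adjacent to a filled value, so it stays NaN
bridged = toy.filled.fillna(toy.loc[m, 'original'])
print(m.tolist())
print(bridged.tolist())
```

Row 4 stays NaN in `bridged`, so the dashed line stops at row 3 and does not run over the solid segment between rows 3 and 4.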
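The above is a clean solution for the general case. If the original values are drawn with a solid, opaque line, and the line width of the filled values is no greater than that of the original values, you can simply plot the completely filled series first and then plot the original values on top of that line: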
df.filled.fillna(df.original).plot(ax=ax, color='blue', ls='--')
df.original.plot(marker='o', ax=ax, color='blue')
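The two lines above assume an existing `ax`. As a self-contained sketch of the same overlay idea, the snippet below wraps them in a helper and adds legend labels. The function name `plot_overlay`, the `label` strings, and the toy data are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def plot_overlay(df, ax, color='blue'):
    """Draw the fully filled series dashed, then the original values on top."""
    # Combine both columns so the dashed line has no gaps
    combined = df['filled'].fillna(df['original'])
    combined.plot(ax=ax, color=color, ls='--', label='filled')
    # The solid line covers the dashed one wherever consecutive originals exist
    df['original'].plot(ax=ax, color=color, marker='o', label='original')
    ax.legend()
    return ax

# Hypothetical toy data with the same column layout as the question
toy = pd.DataFrame({
    'original': [1.0, np.nan, np.nan, 4.0, 5.0],
    'filled':   [np.nan, 2.0, 3.0, np.nan, np.nan],
})
fig, ax = plt.subplots()
plot_overlay(toy, ax)
```

Call `plt.show()` afterwards to display the figure. The draw order matters here: if the dashed line were plotted last, it would sit on top of the solid segments instead of being hidden by them.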