Seaborn stacked histogram with data from multiple columns
How can I plot multiple stacked histograms with Seaborn? I tried the following code, but it raises a size error: ValueError: Length of list vectors must match length of data...
import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'val1': ['a', 'b', np.nan, np.nan, 'a', 'a', np.nan, np.nan, np.nan, 'b'],
                   'val2': [7, 0.2, 5, 8, np.nan, 1, 0, np.nan, 1, 1],
                   'cat': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'yes', 'yes', 'no', 'yes'],
                   })
display(df)
# This call raises the ValueError quoted above
sns.histplot(data=df, y=['val1', 'val2'], hue='cat', multiple='stack')
Desired plot:
val1: "no" frequency = 1 and "yes" frequency = 4
val2: "no" frequency = 4 and "yes" frequency = 4
I don't think you can do this directly from the current dataframe. You need to build a dataframe in which one column holds val1/val2 and another holds yes/no.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                   'val1': ['a', 'b', np.nan, np.nan, 'a', 'a', np.nan, np.nan, np.nan, 'b'],
                   'val2': [7, 0.2, 5, 8, np.nan, 1, 0, np.nan, 1, 1],
                   'cat': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'yes', 'yes', 'no', 'yes'],
                   })

# Keep the rows where val1 is present, then label them with the column name
val1 = df[['cat', 'val1']].dropna().drop(columns='val1')
val1['val'] = 'val1'
# Same for val2
val2 = df[['cat', 'val2']].dropna().drop(columns='val2')
val2['val'] = 'val2'

# Stack the two frames; pd.concat replaces DataFrame.append, which pandas 2.0 removed
plot_df = pd.concat([val1, val2]).sort_values(by='cat')

# One bar per source column, stacked by the yes/no category
sns.histplot(data=plot_df, x='val', stat='count', hue='cat', multiple='stack')
plt.show()
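The same long-form dataframe can probably be built in one step with `DataFrame.melt` instead of handling each column separately. The following is only a sketch on the sample data from the question; the names `long_df` and `counts` are made up for illustration, and the count table is there just to check the reshaping against the desired frequencies.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'val1': ['a', 'b', np.nan, np.nan, 'a', 'a', np.nan, np.nan, np.nan, 'b'],
    'val2': [7, 0.2, 5, 8, np.nan, 1, 0, np.nan, 1, 1],
    'cat': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'yes', 'yes', 'no', 'yes'],
})

# Reshape to long form: one row per (cat, source column) pair.
# 'val' holds the source column name, 'value' holds the original cell.
long_df = (
    df.melt(id_vars='cat', value_vars=['val1', 'val2'], var_name='val')
      .dropna(subset=['value'])   # drop the cells that were missing
      .drop(columns='value')      # only the column name matters for counting
)

# Hypothetical sanity check: rows per source column and category.
counts = long_df.groupby(['val', 'cat']).size().unstack(fill_value=0)
print(counts)
```

If the counts look right, `long_df` should work as a drop-in for `plot_df` in the `sns.histplot(..., x='val', hue='cat', multiple='stack')` call above.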