How to plot pandas groupby values in a graph
I have a CSV file that contains Gender and Married columns, along with several other columns, as shown below.
Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
LP001002,Male,No,0,Graduate,No,5849,0,,360,1,Urban,Y
LP001003,Male,Yes,1,Graduate,No,4583,1508,128,360,1,Rural,N
LP001005,Male,Yes,0,Graduate,Yes,3000,0,66,360,1,Urban,Y
LP001006,Male,Yes,0,Not Graduate,No,2583,2358,120,360,1,Urban,Y
LP001008,Male,No,0,Graduate,No,6000,0,141,360,1,Urban,Y
LP001011,Male,Yes,2,Graduate,Yes,5417,4196,267,360,1,Urban,Y
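I want to count the number of married males and females and show the result in a graph like the one below.
Below is the code I am using: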
import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

if __name__ == '__main__':
    x = []
    y = []
    # Read the Gender and Married columns into a DataFrame and drop rows missing either value
    df = pd.read_csv("/home/train.csv", usecols=[1, 2]).dropna(subset=['Gender', 'Married'])
    # Count the rows in each (Gender, Married) group
    groups = df.groupby(['Gender', 'Married'])['Married'].apply(lambda x: x.count())
    print(groups)
After grouping I get the following result:
Gender  Married
Female  No          80
        Yes         31
Male    No         130
        Yes        357
I would like a chart like the one below.
You can use groupby + size and then use Series.plot.bar:
groups = df.groupby(['Gender','Married']).size()
groups.plot.bar()
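As a minimal sketch of the groupby + size approach, the snippet below rebuilds the counts printed in the question as a small in-memory DataFrame and draws the bar chart. The constructed rows are only a stand-in for train.csv, and the Agg backend and the output file name counts.png are assumptions made so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data: one row per person, matching the counts printed in the question
counts = {("Female", "No"): 80, ("Female", "Yes"): 31,
          ("Male", "No"): 130, ("Male", "Yes"): 357}
rows = [(gender, married) for (gender, married), n in counts.items() for _ in range(n)]
df = pd.DataFrame(rows, columns=["Gender", "Married"])

# size() counts the rows in each (Gender, Married) group
# and returns a Series indexed by both keys
groups = df.groupby(["Gender", "Married"]).size()
print(groups)

# One bar per (Gender, Married) pair
ax = groups.plot.bar()
ax.set_ylabel("Count")
plt.tight_layout()
plt.savefig("counts.png")  # hypothetical output file name
```

Because dropna has already removed missing values in the question's code, size() gives the same numbers here as the original apply(lambda x: x.count()) call.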
Another solution is to add unstack to reshape the result, or to use crosstab:
print (df.groupby(['Gender','Married']).size().unstack(fill_value=0))
Married   No  Yes
Gender
Female    80   31
Male     130  357
df.groupby(['Gender','Married']).size().unstack(fill_value=0).plot.bar()
Or:
pd.crosstab(df['Gender'],df['Married']).plot.bar()
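To illustrate that the two reshaped forms lead to the same chart, here is a small sketch that builds both tables from stand-in data matching the counts in the question and plots grouped bars, one group per Gender and one bar per Married value. The in-memory rows, the Agg backend, and the file name counts_grouped.png are assumptions for the example, not part of the original data set.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data: one row per person, matching the counts printed in the question
counts = {("Female", "No"): 80, ("Female", "Yes"): 31,
          ("Male", "No"): 130, ("Male", "Yes"): 357}
rows = [(gender, married) for (gender, married), n in counts.items() for _ in range(n)]
df = pd.DataFrame(rows, columns=["Gender", "Married"])

# Option 1: count per group, then pivot the Married level into columns
wide = df.groupby(["Gender", "Married"]).size().unstack(fill_value=0)

# Option 2: build the same Gender x Married frequency table directly
ct = pd.crosstab(df["Gender"], df["Married"])

# Check that both tables hold identical counts
same = bool((wide.values == ct.values).all())
print(same)

# Each row (Gender) becomes a group of bars, each column (Married) a bar in that group
ax = ct.plot.bar(rot=0)
ax.set_ylabel("Count")
plt.tight_layout()
plt.savefig("counts_grouped.png")  # hypothetical output file name
```

With the table in this wide shape, the legend shows the Married values, which makes the No/Yes comparison within each gender easier to read than the flat four-bar chart.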