Pandas Pyplot: Counting columns for scatter plot

Question

I have a dataframe with the following columns:

df = pd.read_csv('edtech.csv')
print(df.head())

   Unnamed: 0                                         Title      Date Country  \
0           3     Apple acquires edtech company LearnSprout  15-01-16      US   
1           9  LearnLaunch Accelerator launches new program  15-01-16      US   
2          15                   Flex Class raises financing  15-01-16   India   
3          16               Grovo raises Series C financing  15-01-16      US   
4          17                    Myly raises seed financing  15-01-16   India   

                          Segment  
0             Tools for Educators  
1     Accelerators and Incubators  
2  Adult and Continuing Education  
3               Platforms and LMS  
4                     Mobile Apps  

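Now I want to create a scatter plot by mapping 'Country' on one axis and 'Segment' on the other. For example, for US and 'Tools for Educators', there would be one dot on the chart.

How do I transform this dataframe so that I have numeric values I can render as a scatter plot? I can get the chart in Tableau by using a count, but I don't know exactly how that works behind the scenes.

I would be grateful if someone could help me out. TIA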

Answer 1

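I don't know whether it is possible to create a scatter plot from two non-numeric categorical variables. The closest I could get to what you want is to create the counts using groupby, reshape the data using pivot, and create a heatmap using seaborn: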

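import pandas as pd
import seaborn as sns

df = pd.read_csv('edtech.csv')
# Keep the two categorical columns plus one column to count
dd = df[['Country', 'Segment', 'Title']]
# Count the titles for each (Country, Segment) pair
gg = dd.groupby(['Country', 'Segment'], as_index=False).count().rename(columns={"Title": "Number"})
# Reshape into a Country x Segment grid; pairs with no rows become 0
gp = gg.pivot(columns="Segment", index="Country", values="Number").fillna(0.0)
sns.heatmap(gp, cbar=False)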
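
If a scatter-style chart is still preferred over a heatmap, the same counts can probably be drawn as a bubble plot, since matplotlib accepts strings as categorical coordinates. The following is only a minimal sketch, not part of the original answer: it rebuilds the five rows shown in the question instead of reading edtech.csv, and the marker scaling factor of 100 is an arbitrary assumption.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows copied from the df.head() output in the question;
# the real data would come from pd.read_csv('edtech.csv').
df = pd.DataFrame({
    "Country": ["US", "US", "India", "US", "India"],
    "Segment": [
        "Tools for Educators",
        "Accelerators and Incubators",
        "Adult and Continuing Education",
        "Platforms and LMS",
        "Mobile Apps",
    ],
})

# One row per (Country, Segment) pair with the number of matching records.
counts = df.groupby(["Country", "Segment"]).size().reset_index(name="Number")

fig, ax = plt.subplots()
# String values are treated as categories on each axis;
# the marker area grows with the count (the factor 100 is arbitrary).
ax.scatter(counts["Country"], counts["Segment"], s=counts["Number"] * 100)
ax.set_xlabel("Country")
ax.set_ylabel("Segment")
fig.tight_layout()
```

Calling plt.show() afterwards (or fig.savefig with a file name of your choice) displays the chart. With the full dataset, pairs that occur more often would appear as larger dots, which seems close to what Tableau does when it plots a count.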
