使用 seaborn 绘制 python 中 3 列的热图
Plotting heatmap for 3 columns in python with seaborn
v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86
在上面的数据框中,我想使用 v1 和 v2 作为 x 和 y 轴并使用 yy 作为值来绘制热图。我怎样才能在 python 中做到这一点?我试过 seaborn:
df = df.pivot('v1', 'v2', 'yy')
ax = sns.heatmap(df)
但是,这不起作用。还有其他解决方案吗?
您的数据具有重复值的问题,例如:
100.00 100.00 100.00
100.00 100.00 100.00
您必须删除重复的值,然后像这里那样旋转和绘图:
import seaborn as sns
import pandas as pd
# fill data
df = pd.read_clipboard()
df.drop_duplicates(['v1','v2'], inplace=True)
pivot = df.pivot(index='v1', columns='v2', values='yy')
ax = sns.heatmap(pivot,annot=True)
plt.show()
print (pivot)
枢轴:
v2 44.34 46.23 50.00 52.83 56.52 59.78 63.21 65.09 \
v1
0.00 NaN NaN NaN NaN 92.86 NaN NaN NaN
10.17 NaN NaN NaN 92.86 NaN NaN NaN NaN
15.25 100.0 NaN NaN NaN NaN NaN NaN NaN
23.73 NaN 92.86 NaN NaN NaN NaN NaN NaN
83.05 NaN NaN NaN NaN NaN 100.0 NaN NaN
96.61 NaN NaN NaN NaN NaN NaN NaN 100.0
100.00 NaN NaN 100.0 NaN NaN NaN 100.0 NaN
v2 68.87 75.47 79.35 100.00
v1
0.00 NaN NaN NaN NaN
10.17 NaN NaN NaN NaN
15.25 NaN NaN NaN NaN
23.73 NaN NaN NaN NaN
83.05 NaN NaN NaN NaN
96.61 NaN NaN NaN NaN
100.00 100.0 100.0 100.0 100.0
A seaborn heatmap
绘制分类数据。这意味着每个出现的值在热图中都将采用与任何其他值相同的 space,这与它们在数字上分开的距离无关。对于数字数据,这通常是不需要的。相反,可以选择以下技术之一。
Scatter
彩色散点图可能与热图一样好。点的颜色代表 yy
值。
ax.scatter(df.v1, df.v2, c=df.yy, cmap="copper")
u = u"""v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""
import pandas as pd
import matplotlib.pyplot as plt
import io
df = pd.read_csv(io.StringIO(u), delim_whitespace=True )
fig, ax = plt.subplots()
sc = ax.scatter(df.v1, df.v2, c=df.yy, cmap="copper")
fig.colorbar(sc, ax=ax)
ax.set_aspect("equal")
plt.show()
Hexbin
您可能需要查看 hexbin
。数据将显示在六边形箱中,并且数据聚合为每个箱内的平均值。这里的好处是,如果你选择大的网格大小,它看起来像一个散点图,而如果你把它设置得小,它看起来像一个热图,允许轻松地将图调整到所需的分辨率。
h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")
u = u"""v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""
import pandas as pd
import matplotlib.pyplot as plt
import io
df = pd.read_csv(io.StringIO(u), delim_whitespace=True )
fig, (ax, ax2) = plt.subplots(nrows=2)
h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")
fig.colorbar(h1, ax=ax)
fig.colorbar(h2, ax=ax2)
ax.set_aspect("equal")
ax2.set_aspect("equal")
ax.set_title("gridsize=100")
ax2.set_title("gridsize=10")
fig.subplots_adjust(hspace=0.3)
plt.show()
Tripcolor
A tripcolor
plot 可用于根据数据点获得图中的彩色区域,然后将其解释为三角形的边缘,根据边缘点的数据着色。这样的图需要有更多可用数据才能提供有意义的表示。
ax.tripcolor(df.v1, df.v2, df.yy, cmap="copper")
u = u"""v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""
import pandas as pd
import matplotlib.pyplot as plt
import io
df = pd.read_csv(io.StringIO(u), delim_whitespace=True )
fig, ax = plt.subplots()
tc = ax.tripcolor(df.v1, df.v2, df.yy, cmap="copper")
fig.colorbar(tc, ax=ax)
ax.set_aspect("equal")
ax.set_title("tripcolor")
plt.show()
请注意,如果整个网格中有更多数据点可用,tricontourf
图可能同样适用。
ax.tricontourf(df.v1, df.v2, df.yy, cmap="copper")
v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86
在上面的数据框中,我想使用 v1 和 v2 作为 x 和 y 轴并使用 yy 作为值来绘制热图。我怎样才能在 python 中做到这一点?我试过 seaborn:
df = df.pivot('v1', 'v2', 'yy')
ax = sns.heatmap(df)
但是,这不起作用。还有其他解决方案吗?
您的数据具有重复值的问题,例如:
100.00 100.00 100.00
100.00 100.00 100.00
您必须删除重复的值,然后像这里那样旋转和绘图:
import seaborn as sns
import pandas as pd
# fill data
df = pd.read_clipboard()
df.drop_duplicates(['v1','v2'], inplace=True)
pivot = df.pivot(index='v1', columns='v2', values='yy')
ax = sns.heatmap(pivot,annot=True)
plt.show()
print (pivot)
枢轴:
v2 44.34 46.23 50.00 52.83 56.52 59.78 63.21 65.09 \
v1
0.00 NaN NaN NaN NaN 92.86 NaN NaN NaN
10.17 NaN NaN NaN 92.86 NaN NaN NaN NaN
15.25 100.0 NaN NaN NaN NaN NaN NaN NaN
23.73 NaN 92.86 NaN NaN NaN NaN NaN NaN
83.05 NaN NaN NaN NaN NaN 100.0 NaN NaN
96.61 NaN NaN NaN NaN NaN NaN NaN 100.0
100.00 NaN NaN 100.0 NaN NaN NaN 100.0 NaN
v2 68.87 75.47 79.35 100.00
v1
0.00 NaN NaN NaN NaN
10.17 NaN NaN NaN NaN
15.25 NaN NaN NaN NaN
23.73 NaN NaN NaN NaN
83.05 NaN NaN NaN NaN
96.61 NaN NaN NaN NaN
100.00 100.0 100.0 100.0 100.0
A seaborn heatmap
绘制分类数据。这意味着每个出现的值在热图中都将采用与任何其他值相同的 space,这与它们在数字上分开的距离无关。对于数字数据,这通常是不需要的。相反,可以选择以下技术之一。
Scatter
彩色散点图可能与热图一样好。点的颜色代表 yy
值。
ax.scatter(df.v1, df.v2, c=df.yy, cmap="copper")
u = u"""v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""
import pandas as pd
import matplotlib.pyplot as plt
import io
df = pd.read_csv(io.StringIO(u), delim_whitespace=True )
fig, ax = plt.subplots()
sc = ax.scatter(df.v1, df.v2, c=df.yy, cmap="copper")
fig.colorbar(sc, ax=ax)
ax.set_aspect("equal")
plt.show()
Hexbin
您可能需要查看 hexbin
。数据将显示在六边形箱中,并且数据聚合为每个箱内的平均值。这里的好处是,如果你选择大的网格大小,它看起来像一个散点图,而如果你把它设置得小,它看起来像一个热图,允许轻松地将图调整到所需的分辨率。
h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")
u = u"""v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""
import pandas as pd
import matplotlib.pyplot as plt
import io
df = pd.read_csv(io.StringIO(u), delim_whitespace=True )
fig, (ax, ax2) = plt.subplots(nrows=2)
h1 = ax.hexbin(df.v1, df.v2, C=df.yy, gridsize=100, cmap="copper")
h2 = ax2.hexbin(df.v1, df.v2, C=df.yy, gridsize=10, cmap="copper")
fig.colorbar(h1, ax=ax)
fig.colorbar(h2, ax=ax2)
ax.set_aspect("equal")
ax2.set_aspect("equal")
ax.set_title("gridsize=100")
ax2.set_title("gridsize=10")
fig.subplots_adjust(hspace=0.3)
plt.show()
Tripcolor
A tripcolor
plot 可用于根据数据点获得图中的彩色区域,然后将其解释为三角形的边缘,根据边缘点的数据着色。这样的图需要有更多可用数据才能提供有意义的表示。
ax.tripcolor(df.v1, df.v2, df.yy, cmap="copper")
u = u"""v1 v2 yy
15.25 44.34 100.00
83.05 59.78 100.00
96.61 65.09 100.00
100.00 75.47 100.00
100.00 50.00 100.00
100.00 68.87 100.00
100.00 79.35 100.00
100.00 100.00 100.00
100.00 63.21 100.00
100.00 100.00 100.00
100.00 68.87 100.00
0.00 56.52 92.86
10.17 52.83 92.86
23.73 46.23 92.86"""
import pandas as pd
import matplotlib.pyplot as plt
import io
df = pd.read_csv(io.StringIO(u), delim_whitespace=True )
fig, ax = plt.subplots()
tc = ax.tripcolor(df.v1, df.v2, df.yy, cmap="copper")
fig.colorbar(tc, ax=ax)
ax.set_aspect("equal")
ax.set_title("tripcolor")
plt.show()
请注意,如果整个网格中有更多数据点可用,tricontourf
图可能同样适用。
ax.tricontourf(df.v1, df.v2, df.yy, cmap="copper")