Matplotlib / Seaborn scatterplot doesn't use correct markers, and colorbar fix

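I am trying to visualize a UMAP embedding and already have the code and plot below. My goal is to use two different markers for the two classes in my dataset, and one color for each group in my dataset (the groups are VP XXX, see the colorbar in the image). The per-group coloring has actually already worked out somehow.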

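The problem is that the markers are not the ones I want, and the colorbar is not very precise in telling me which color belongs to which group.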

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

#### prepare lists ####
VP_with_col = []
m=[]
col = []
# a NumPy array is needed so that embedding[:,0] slicing works further down
embedding = np.array([[0,0.8],[0.5,0.5], [0.9,0.5],[0.2,0.9],[0.4,0.4],[0.6,0.5],[0.77,0.59],[0.8,0.1]])
EXAMPLE_VP = ["VP124","VP124","VP125", "VP125", "VP203", "VP203","VP258","VP258"]
EXAMPLE_LABELS = [0,1,0,1,0,1,0,1]
dataframe = pd.DataFrame({"VP": EXAMPLE_VP, "label": EXAMPLE_LABELS})
VP_list = dataframe.VP.unique()
# add color/value to each unique VP
for idx, vp in enumerate(VP_list):
    VP_with_col.append([1 + idx, vp])  # somehow this gives me a different color for each group which is great

# create color array of length len(dataframe.VP) with a color for each group
for vp in dataframe.VP:
    for vp_col in VP_with_col:
        if vp_col[1] == vp:
            col.append(vp_col[0])

#### create marker list ####
for elem in dataframe.label:
    if elem == 0:
        m.append("o")
    else:
        m.append("^")

########################## relevant part for question ############################

#### create plot dataframe from lists and UMAP embedding ####
plot_df = pd.DataFrame(data={"x":embedding[:,0], "y": embedding[:,1], "color":col, "marker": m })

plt.style.use("seaborn")  # this style is named "seaborn-v0_8" in Matplotlib 3.6 and later
plt.figure()

#### Plot ####
ax= sns.scatterplot(data=plot_df, x="x",y="y",style= "marker" , c= col, cmap='Spectral', s=5 )
ax.set(xlabel = None, ylabel = None)
plt.gca().set_aspect('equal', 'datalim')

#### Colorbar ####
norm = plt.Normalize(min(col), max(col))
sm = plt.cm.ScalarMappable(cmap="Spectral", norm=norm)
sm.set_array([])

# Remove the legend(marker legend) , add colorbar
ax.get_legend().remove()
cb = ax.figure.colorbar(sm)  

cb.set_ticks(np.arange(len(VP_list)))
cb.set_ticklabels(VP_list)

##### save
plt.title('UMAP projection of feature space', fontsize=12) 
plt.savefig("./umap_plot",dpi=1200)

This gives me a plot with the standard markers and 'x' markers. For style="marker", the marker column of the dataframe is something like ["^", "o", "^", "^", "^", "o", ...].

Is it also possible to make the colorbar show more clearly which color belongs to which class?

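You are doing a lot of manipulation that is only needed when using matplotlib without Seaborn. With Seaborn, most of it happens automatically. Here is how it could look with your test data: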

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

embedding = np.array([[0, 0.8], [0.5, 0.5], [0.9, 0.5], [0.2, 0.9], [0.4, 0.4], [0.6, 0.5], [0.77, 0.59], [0.8, 0.1]])
EXAMPLE_VP = ["VP124", "VP124", "VP125", "VP125", "VP203", "VP203", "VP258", "VP258"]
EXAMPLE_LABELS = [0, 1, 0, 1, 0, 1, 0, 1]
plot_df = pd.DataFrame({"x": embedding[:, 0], "y": embedding[:, 1], "VP": EXAMPLE_VP, "label": EXAMPLE_LABELS})

plt.figure()
plt.style.use("seaborn")  # this style is named "seaborn-v0_8" in Matplotlib 3.6 and later

ax = sns.scatterplot(data=plot_df, x="x", y="y",
                     hue='VP', palette='Spectral',
                     style="label", markers=['^', 'o'], s=100)
ax.set(xlabel=None, ylabel=None)
ax.set_aspect('equal', 'datalim')
# sns.move_legend(ax, bbox_to_anchor=(1.01, 1.01), loc='upper left')

plt.tight_layout()
plt.show()
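
The markers in your original attempt came out as defaults because `style` only tells Seaborn which column defines the groups. The strings stored in that column are treated as category names, not as marker specifications; the actual symbols come from the `markers` argument. If the mapping from your question (label 0 as a circle, label 1 as a triangle) is the one you want, a dict makes the assignment explicit instead of relying on list order. This is only a minimal sketch: the `marker_map` name and the shortened sample data are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Shortened stand-in for the test data (illustrative values only)
plot_df = pd.DataFrame({
    "x": [0.0, 0.5, 0.9, 0.2],
    "y": [0.8, 0.5, 0.5, 0.9],
    "VP": ["VP124", "VP124", "VP125", "VP125"],
    "label": [0, 1, 0, 1],
})

# Hypothetical explicit mapping: each level of the style column -> matplotlib marker
marker_map = {0: "o", 1: "^"}

fig, ax = plt.subplots()
sns.scatterplot(data=plot_df, x="x", y="y",
                hue="VP", palette="Set2",
                style="label", markers=marker_map, s=100, ax=ax)

# The legend lists both the color groups and the marker classes
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

With a dict, every level that occurs in the style column needs its own key; a list is shorter to write but ties the result to the order in which Seaborn sorts the levels.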

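Note that the 'Spectral' colormap assigns a light yellow color to 'VP203', which is hard to see against the default background. You might want to use e.g. palette='Set2' for the colors.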
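
To judge whether a palette contains entries that wash out on a light background, it can help to look at the colors Seaborn would hand out before plotting. The sketch below is one rough way to do that: it asks `sns.color_palette` for four colors (one per VP group in the test data) and prints a simple brightness estimate for each. The `luminance` helper and its Rec. 709 weights are an assumption chosen for illustration, not something Seaborn provides; values close to 1 indicate colors near white.

```python
import seaborn as sns


def luminance(rgb):
    """Rough brightness in 0..1 of an (r, g, b) tuple (Rec. 709 weights, no gamma correction)."""
    r, g, b = rgb[:3]
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


n_groups = 4  # VP124, VP125, VP203, VP258 in the test data

for name in ("Spectral", "Set2"):
    colors = sns.color_palette(name, n_groups)
    # as_hex() gives the same colors as hex strings, which are easier to read
    for hex_code, rgb in zip(colors.as_hex(), colors):
        print(f"{name:8s} {hex_code}  luminance={luminance(rgb):.2f}")
```

Any palette name accepted by `sns.color_palette` can be added to the loop to compare candidates.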