In an overlapping scatter plot, how to give preference to specific data?

Question

This is my code at the moment:

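import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap

df = pd.read_csv("Table.csv")
x = df['Fe']
y = df['V']
z = df['HIP']  # a column of strings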

# left, bottom, width and height are assumed to be defined earlier (not shown)
rect_scatter = [left, bottom, width, height]
fig=plt.figure(figsize=(10, 8))
ax_scatter = plt.axes(rect_scatter)
ax_scatter.tick_params(direction='in', top=True, right=True)


# assign each point to a class (1 for the listed stars, 0 otherwise):
classes = np.zeros( len(x) )
classes[(z == 'KOI-2')]= 1
classes[(z == 'KOI-10')]= 1
classes[(z == 'KOI-17')]= 1
classes[(z == 'KOI-18')]= 1
classes[(z == 'KOI-22')]= 1
classes[(z == 'KOI-94')]= 1
classes[(z == 'KOI-97')]= 1


# create color map:
colors = ['green', 'red']
cm = LinearSegmentedColormap.from_list('custom', colors, N=len(colors))


# the scatter plot:
scatter = ax_scatter.scatter(x, y, c=classes, s=10, cmap=cm)
lines, labels = scatter.legend_elements()

# legend with custom labels
labels = [r'Hypatia', r'CKS']
legend = ax_scatter.legend(lines, labels,
                    loc="upper left", title="Stars with giant planets")
ax_scatter.add_artist(legend)


ax_scatter.set_xlabel('[Fe/H]')
ax_scatter.set_ylabel('[V/H]')

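However, my data has many values besides the 7 that I set to classes=1. Because of that, when I draw the scatter plot, these 7 points are covered by the hundreds of other points. How can I make these 7 points appear in front of the others in the plot? Is there a way to give one class priority over the other?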

Answer 1

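In your case it is simpler to split the data before plotting and then call ax.scatter twice. By default, the last call is drawn on top, because artists with the same zorder are rendered in the order they were added to the axes.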

Without access to your data I can't test this properly, but something like this should work:

class_one_strings = ['KOI-2', 'KOI-10', 'KOI-17', 'KOI-18', 'KOI-22', 'KOI-94', 'KOI-97']

# 1 for the listed stars, 0 for everything else
df['Classes'] = df['HIP'].isin(class_one_strings).astype(int)

class_zero_x = df.loc[df['Classes'] == 0, 'Fe']
class_zero_y = df.loc[df['Classes'] == 0, 'V']

class_one_x = df.loc[df['Classes'] == 1, 'Fe']
class_one_y = df.loc[df['Classes'] == 1, 'V']

ax_scatter.scatter(class_zero_x, class_zero_y, c='green', s=10)
ax_scatter.scatter(class_one_x, class_one_y, c='red', s=10)
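
For anyone who wants to try the two-call approach without the original Table.csv, here is a minimal self-contained sketch. The data is synthetic and only a stand-in: the column names follow the question, but the values, the number of rows and the variable names (highlight, is_highlight, background, foreground) are assumptions made for illustration. The legend labels follow the question's color mapping (green for Hypatia, red for CKS).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for Table.csv (assumption: the real file has these columns)
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "HIP": [f"KOI-{i}" for i in range(n)],
    "Fe": rng.normal(0.0, 0.2, n),
    "V": rng.normal(0.0, 0.2, n),
})

highlight = ["KOI-2", "KOI-10", "KOI-17", "KOI-18", "KOI-22", "KOI-94", "KOI-97"]
is_highlight = df["HIP"].isin(highlight)  # boolean mask, True for the 7 stars

fig, ax = plt.subplots(figsize=(10, 8))

# First call: the bulk of the data, drawn underneath
background = ax.scatter(df.loc[~is_highlight, "Fe"], df.loc[~is_highlight, "V"],
                        c="green", s=10, label="Hypatia")
# Second call: the highlighted points, added later and therefore drawn on top
foreground = ax.scatter(df.loc[is_highlight, "Fe"], df.loc[is_highlight, "V"],
                        c="red", s=10, label="CKS")

ax.legend(loc="upper left", title="Stars with giant planets")
ax.set_xlabel("[Fe/H]")
ax.set_ylabel("[V/H]")
fig.canvas.draw()  # force a render; use plt.show() or fig.savefig(...) to view it
```

Because each call carries its own label, the legend can be built directly with ax.legend() instead of going through legend_elements().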

Answer 2

In addition to jfaccioni's answer, you can also set the drawing order explicitly with the zorder parameter. See the docs.

For each scatter call you can specify its order:

ax.scatter(x, y, s=12, zorder=2)
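
As a rough illustration of how zorder overrides the call order, the sketch below deliberately adds the highlighted points first and the background points second, and still gets the highlighted points on top. The data is random and the names (fg_x, bg_x, front, back) are made up for this example; only the zorder keyword itself comes from the answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Random stand-in data (assumption: 7 highlighted points among 500 others)
rng = np.random.default_rng(1)
bg_x, bg_y = rng.normal(size=500), rng.normal(size=500)
fg_x, fg_y = rng.normal(size=7), rng.normal(size=7)

fig, ax = plt.subplots()

# Highlighted points are added first, but the higher zorder draws them on top
front = ax.scatter(fg_x, fg_y, c="red", s=10, zorder=3)
# Background points are added last, but the lower zorder keeps them underneath
back = ax.scatter(bg_x, bg_y, c="green", s=10, zorder=2)

fig.canvas.draw()  # force a render; use plt.show() or fig.savefig(...) to view it
```

Artists with a higher zorder are drawn later, so the relative values matter more than the specific numbers chosen here.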
