Combine nearby scatter points into one and increase its size

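I have three variables: RZ, PRP and TC. I plotted RZ against PRP as a scatter plot, with the colour varying by TC. RZ ranges from 0 to 800, PRP from 0 to 4000, and TC from 0 to 100. The code and the resulting figure are below: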

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 10))
points = plt.scatter(RZS_P.PRP, RZS_P.RZ, c=RZS_P.TC, cmap="Spectral",
                     lw=1, s=60, vmax=100, vmin=0, alpha=0.7, edgecolors='b')
plt.colorbar(points)

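What I would like to do is combine points that look alike within a neighbourhood, say PRP (± 250), RZ (± 50) and TC (± 5) [or something similar], into a single point and increase its size. This would give a better visualisation than the plain scatter plot. Basically, I want to combine scatter points with similar values (or values falling within a range or interval) by taking their mean and then plotting that.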

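Below is some code I came up with (although it only increases the marker size where points overlap exactly and does not take neighbours into account):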

# First define a dict mapping each PRP bin's upper edge to the RZ values in that bin
data_dict = {250: np.array(RZS_P['RZ'][RZS_P.PRP < 250]),
             500: np.array(RZS_P['RZ'][(RZS_P.PRP > 250) & (RZS_P.PRP < 500)]),
             ....................
             4000:np.array(RZS_P['RZ'][(RZS_P.PRP > 3750) & (RZS_P.PRP< 4000)])}
size_constant = 20

for xe, ye in data_dict.items():
    xAxis = [xe] * len(ye)

    # cube the count to amplify the effect; with ye.count(num)*size_constant alone the effect is barely noticeable
    sizes = [ye.tolist().count(num)**3 * size_constant for num in ye]
    plt.scatter(xAxis, ye, s=sizes)
plt.show()

Ideally, my figure would look like this: Can someone help me with this?

Additional information: more on coding this dynamically

### Divide the dataset into categories first and then plot
P_range = np.arange(0, 4000, 500)
RZ_range = np.arange(0, 1000, 100)
TC_range = np.arange(0, 100, 10)

i = 0; j = 0; k = 0
RZS_P[(RZS_P.PRP >= P_range[i]-250) & (RZS_P.PRP < P_range[i]+250) & (RZS_P.RZ >= RZ_range[j]-50) &
      (RZS_P.RZ < RZ_range[j]+50) & (RZS_P.TC >= TC_range[k]-5) & (RZS_P.TC < TC_range[k]+5)].describe()
[Output]:
        RZ          PRP         TC  
count   1.000000    1.000000    1.000000    
mean    43.614338   220.068451  2.179487    
std      NaN        NaN         NaN         
### For above, I want my scatter point to remain same

i = 0; j = 1; k = 0; 
[Output]:
        RZ          PRP         TC  
count   28.000000   28.000000   28.000000   
mean    104.511887  124.827377  1.982593    
std      29.474167  62.730640   0.977752    
## For this subset I want the scatter point to be an oval: 62 wide along the x-axis (std of PRP)
## and 29 tall along the y-axis (std of RZ), centred at x = 124 and y = 104 (the means of PRP and RZ).
## Since the count is 28, I want this point to be relatively bigger than the
## previous one (scaled by this count throughout the analysis). The mean TC
## would be used as the colour axis (same as Fig. 1).

The closest I have come to my goal:

P_range = np.arange(0, 4000, 200)
RZ_range = np.arange(0, 1000, 50)
TC_range = np.arange(0, 110, 10)

x = []; y = []; z = []; height = []; width = []; size = []
for p in P_range:
    for rz in RZ_range:
        for tc in TC_range:
            # rows whose values fall in the bin centred on (p, rz, tc)
            in_bin = ((RZS_P.PRP >= p - 100) & (RZS_P.PRP < p + 100) &
                      (RZS_P.RZ >= rz - 25) & (RZS_P.RZ < rz + 25) &
                      (RZS_P.TC >= tc - 5) & (RZS_P.TC < tc + 5))
            stats = RZS_P[in_bin].describe()
            x.append(stats.loc['mean', 'PRP'])
            y.append(stats.loc['mean', 'RZ'])
            z.append(stats.loc['mean', 'TC'])
            width.append(stats.loc['std', 'PRP'])   # spread along the x-axis
            height.append(stats.loc['std', 'RZ'])   # spread along the y-axis
            size.append(stats.loc['count', 'RZ'])   # number of points in the bin

final_scatters = pd.DataFrame({'PRP': x, 'RZ': y, 'TC': z, 'height': height, 'width': width, 'size': size})
#final_scatters looks like this
    PRP         RZ          TC           height      width      size
22  84.423500   91.315781   2.492503    17.500629   18.499458   2.0
33  61.671188   137.650848  1.305071    18.169079   20.138525   6.0
143 53.673630   634.536926  3.443243    1.000000    1.000000    1.0
231 202.459641  62.480145   2.156926    8.962382    46.061661   21.0
242 217.588333  98.111694   2.011893    15.964933   59.468643   20.0
....................................................................

fig = plt.figure(figsize=(12, 10))

points = plt.scatter(final_scatters.PRP, final_scatters.RZ, c=final_scatters.TC, cmap="Spectral",
                     s = final_scatters['size']*40, vmax = 100, vmin =0, alpha = 0.9, edgecolors= 'black')
plt.colorbar(points)
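
As a side note, the nested loops above could probably also be expressed with pd.cut and groupby, which leaves out empty bins. This is only a minimal sketch under stated assumptions: RZS_P here is synthetic stand-in data with the columns PRP, RZ and TC, the edge arrays and the *_bin names are made up for illustration, and bins that contain a single point get NaN for their std.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for RZS_P (assumption: columns PRP, RZ, TC as in the question)
rng = np.random.default_rng(0)
RZS_P = pd.DataFrame({
    "PRP": rng.uniform(0, 4000, 500),
    "RZ": rng.uniform(0, 800, 500),
    "TC": rng.uniform(0, 100, 500),
})

# Edges chosen so that every bin is [centre - half_width, centre + half_width),
# which mirrors the >= / < comparisons used in the loop.
p_edges = np.arange(-100, 4000, 200)   # centres 0, 200, ..., 3800
rz_edges = np.arange(-25, 1000, 50)    # centres 0, 50, ..., 950
tc_edges = np.arange(-5, 110, 10)      # centres 0, 10, ..., 100

# Illustrative key names; right=False makes the intervals closed on the left
keys = [
    pd.cut(RZS_P["PRP"], p_edges, right=False).rename("PRP_bin"),
    pd.cut(RZS_P["RZ"], rz_edges, right=False).rename("RZ_bin"),
    pd.cut(RZS_P["TC"], tc_edges, right=False).rename("TC_bin"),
]
binned = RZS_P.groupby(keys, observed=True)  # observed=True skips empty bins

final_scatters = pd.DataFrame({
    "PRP": binned["PRP"].mean(),
    "RZ": binned["RZ"].mean(),
    "TC": binned["TC"].mean(),
    "height": binned["RZ"].std(),    # NaN when the bin holds one point
    "width": binned["PRP"].std(),    # NaN when the bin holds one point
    "size": binned["PRP"].count(),
}).reset_index(drop=True)
```

Rows whose values fall outside all edges (for example PRP of 3900 or more here) are dropped, just as they match no bin in the loop.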

Now I am trying the following to draw ellipses, but I get an empty frame:

ells = [Ellipse(xy = np.array([np.array(final_scatters)[i,0], np.array(final_scatters)[i,1]]), width=np.array(final_scatters)[i,4], 
                height=np.array(final_scatters)[i,3]) for i in range(len(final_scatters))]
fig = plt.figure(0)
ax = fig.add_subplot(111)
for e in ells:
    ax.add_artist(e)
    e.set_clip_box(ax.bbox)
    e.set_alpha(rnd.rand())
    e.set_facecolor(rnd.rand(3))

If your final_scatters data frame is well structured, with each row describing one intended ellipse:

final_scatters = pd.DataFrame({'PRP': x, 'RZ': y, 'TC': z, 'height': height, 'width': width, 'size': size})
#final_scatters looks like this
    PRP         RZ          TC           height      width      size
22  84.423500   91.315781   2.492503    17.500629   18.499458   2.0
33  61.671188   137.650848  1.305071    18.169079   20.138525   6.0
143 53.673630   634.536926  3.443243    1.000000    1.000000    1.0
231 202.459641  62.480145   2.156926    8.962382    46.061661   21.0
242 217.588333  98.111694   2.011893    15.964933   59.468643   20.0

you can iterate over it row by row and draw the ellipses:

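from matplotlib.patches import Ellipse

fig, ax = plt.subplots()  # plt.subplots() returns (fig, ax); plt.subplot() returns only an Axes

for i, row in final_scatters.iterrows():
    ax.add_patch(Ellipse(
        xy = (row['PRP'], row['RZ']),
        width = row['width'],
        height = row['height'],
        alpha = 0.5  # in case you want some transparency
    ))

# patches do not rescale the axes on their own, so fit the limits to the ellipses
ax.autoscale_view()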
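
If the ellipses should also carry the TC colour scale used in the scatter plots, one possible approach is to map each row's TC through a colormap and attach a colorbar by hand. The following is a sketch only: it uses the five sample rows shown above as data, assumes the same "Spectral" colormap and 0 to 100 range as the question, and does not use the size column, since each ellipse's extent already comes from width and height.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import colors
from matplotlib.cm import ScalarMappable
from matplotlib.patches import Ellipse

# The sample rows of final_scatters shown above
final_scatters = pd.DataFrame({
    "PRP": [84.423500, 61.671188, 53.673630, 202.459641, 217.588333],
    "RZ": [91.315781, 137.650848, 634.536926, 62.480145, 98.111694],
    "TC": [2.492503, 1.305071, 3.443243, 2.156926, 2.011893],
    "height": [17.500629, 18.169079, 1.000000, 8.962382, 15.964933],
    "width": [18.499458, 20.138525, 1.000000, 46.061661, 59.468643],
    "size": [2.0, 6.0, 1.0, 21.0, 20.0],
})

# Same colour range as the scatter plots (vmin=0, vmax=100)
norm = colors.Normalize(vmin=0, vmax=100)
cmap = plt.get_cmap("Spectral")

fig, ax = plt.subplots(figsize=(12, 10))
for _, row in final_scatters.iterrows():
    ax.add_patch(Ellipse(
        xy=(row["PRP"], row["RZ"]),
        width=row["width"],
        height=row["height"],
        facecolor=cmap(norm(row["TC"])),  # colour by mean TC
        edgecolor="black",
        alpha=0.9,
    ))

ax.autoscale_view()  # fit the axis limits to the added patches
ax.set_xlabel("PRP")
ax.set_ylabel("RZ")

# The patches carry no colour mapping themselves, so build the colorbar
# from a standalone ScalarMappable with the same norm and colormap.
fig.colorbar(ScalarMappable(norm=norm, cmap=cmap), ax=ax, label="TC")
```

Call plt.show() or fig.savefig(...) afterwards to view the result. If final_scatters contains NaN rows from empty bins, dropping them first with final_scatters.dropna() is probably a good idea.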