Python: Plot from second and third columns while picking parameter values from the first one

Question

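I have three columns of data in a file named "sample1.dat", and a script that reads these columns and tries to plot the 3rd column against the 2nd. I take the parameter value from the elements of the first column, for as long as their value stays the same.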

"sample1.dat" reads:

0   1   1
0   2   4
0   3   9
0   4   16
0   5   25
0   6   36
1   1   1
1   2   8
1   3   27
1   4   64
1   5   125
1   6   216
2   1   1
2   2   16
2   3   81
2   4   256
2   5   625
2   6   1296

And my code:

import matplotlib.pyplot as plt
import numpy as np

data = np.loadtxt('sample1.dat')
x = data[:,0] 
y = data[:,1] 
z = data[:,2]
L = len(data)

col = ['r','g','b']
x0 = x[0]; j=0; jold=-1


for i in range(L):
  print('j, col[j]=',j, col[j])
  if x[i] == x0:
     print('y[i], z[i]=',y[i],z[i])
     if i==0 or j != jold: # j-index decides new or the same paramet
         label = 'parameter = {}'.format(x0)
     else:
         label = ''
     print('label =',label)
     plt.plot(y[i], z[i], color=col[j], marker='o', label=label)
  else:
     x0 = x[i] # Update when x-value changes, 
            # i.e. pick up the next parameter value
     i -= 1 # Shift back else we miss the 1st point for new x-value 
     j += 1; jold = j

plt.legend()
plt.xlabel('2nd column') 
plt.ylabel('3rd column')
plt.savefig('sample1.png') 
plt.show()

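In the resulting plot, two problems clearly remain:

The legend appears only for the first parameter, even though I tried to avoid repetition in my code.
The default line style does not appear in the plot, although the legend shows a line-plus-marker entry.

How can I resolve these issues, or is there a smarter way of coding this to achieve the same result?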

Answer 1

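The first problem is due to some strange logic involving j, jold and x0. The code can be simplified by plotting all y, z values for each x value at once. NumPy allows selecting the y values that correspond to a given x0 as y[x == x0].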
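To see what that logic actually does, the bookkeeping of the question's loop can be replayed without any plotting. This is only a sketch: the list `plotted` and the names `skipped` and `labels` are introduced here for illustration, and the first column is inlined as integers (so the label reads "0" rather than "0.0").

```python
# First column of sample1.dat: three parameter values, six rows each.
x = [0] * 6 + [1] * 6 + [2] * 6

x0, j, jold = x[0], 0, -1
plotted = []  # (row index, label) for every point the original loop would plot

for i in range(len(x)):
    if x[i] == x0:
        # Same label condition as in the question's code.
        label = f'parameter = {x0}' if (i == 0 or j != jold) else ''
        plotted.append((i, label))
    else:
        x0 = x[i]
        i -= 1  # no effect: the for loop rebinds i on the next iteration
        j += 1
        jold = j

plotted_rows = {row for row, _ in plotted}
skipped = [row for row in range(len(x)) if row not in plotted_rows]
labels = [label for _, label in plotted if label]

print('rows never plotted:', skipped)
print('non-empty labels:', labels)
```

In this replay the first row of each new parameter value is never plotted, every point of the first group gets a label, and the later groups get none, because j and jold are set equal as soon as the parameter changes.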

The second problem can be solved by explicitly setting the desired line style, i.e. ls=''.

import matplotlib.pyplot as plt
import numpy as np

data = np.loadtxt('sample1.dat')
x = data[:, 0]
y = data[:, 1]
z = data[:, 2]
colors = ['r', 'g', 'b']

for x0, color in zip(np.unique(x), colors):
    plt.plot(y[x == x0], z[x == x0], color=color, marker='o', ls='', label=f'parameter = {x0:.0f}')

plt.legend()
plt.xlabel('2nd column')
plt.ylabel('3rd column')
plt.show()
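The boolean-mask selection used in the loop above can also be checked on its own, without a plot. The sketch below rebuilds the values of sample1.dat in memory (an assumption made so that it runs without the file); the dictionary `groups` is a name introduced here for illustration.

```python
import numpy as np

# Same values as sample1.dat: parameter p in {0, 1, 2}, n = 1..6, third column n**(p + 2).
data = np.array([[p, n, n ** (p + 2)] for p in range(3) for n in range(1, 7)], dtype=float)
x, y, z = data.T  # one 1-D array per column

groups = {}
for x0 in np.unique(x):
    mask = x == x0  # True on the rows whose first column equals x0
    groups[int(x0)] = (y[mask], z[mask])
    print(int(x0), y[mask], z[mask])
```

Each entry of `groups` holds exactly the pair of arrays that a single plt.plot call receives for that parameter value.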

Another approach is to use the seaborn library, which handles the selection and coloring without much intervention, for example:

import seaborn as sns

sns.scatterplot(x=y, y=z, hue=x, palette=['r', 'g', 'b'])

If the data is organized as a dictionary or a pandas DataFrame, seaborn can add the labels automatically:

data = {'first column': x.astype(int),
        'second column': y,
        'third column': z}
sns.scatterplot(data=data, x='second column', y='third column', hue='first column', palette=['r', 'g', 'b'])

Answer 2

With pandas and seaborn, you can get the result you want in a few lines. If you add column names (e.g. A, B and C) to the data in the sample1.dat file, like this:

A   B   C
0   1   1
0   2   4
0   3   9
0   4   16
0   5   25
0   6   36
1   1   1
1   2   8
1   3   27
1   4   64
1   5   125
1   6   216
2   1   1
2   2   16
2   3   81
2   4   256
2   5   625
2   6   1296

You can then load the data into a pandas DataFrame and plot it with seaborn:

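import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Read the fixed-width file; the first line supplies the column names A, B, C
df = pd.read_fwf('sample1.dat')
col = ['r', 'g', 'b']
# Plot C against B, with one color per distinct value of A
sns.scatterplot(data=df, x='B', y='C', hue='A', palette=col)
plt.show()  # needed when running as a script rather than in a notebook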

This gives a scatter plot of C against B, with the points colored by the value of A.
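If finer control over each series is wanted than seaborn offers, the same DataFrame can be split by the parameter column with groupby. This is a minimal sketch, not part of the answer above: the file contents are rebuilt in memory, the text is parsed with read_csv and a whitespace separator instead of read_fwf, and `per_group` is a name introduced here for illustration.

```python
import io

import pandas as pd

# Rebuild the text of sample1.dat with the header line A B C.
rows = ['A   B   C'] + [f'{p}   {n}   {n ** (p + 2)}' for p in range(3) for n in range(1, 7)]
df = pd.read_csv(io.StringIO('\n'.join(rows)), sep=r'\s+')

per_group = {}
for a, group in df.groupby('A'):
    # group holds only the rows that share this value of A
    per_group[int(a)] = (group['B'].tolist(), group['C'].tolist())
    print(int(a), per_group[int(a)])
```

Each (B, C) pair could then be passed to plt.plot with its own color and label, as in the loop of Answer 1.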
