Matplotlib: Import and plot multiple time series with legends direct from .csv

Question

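I have several spreadsheets with data saved as comma-delimited (.csv) files in the following format: the first row contains the column labels as strings ('Time', 'Parameter_1'...). The first column of data is Time, and each subsequent column contains the corresponding parameter data, as floats or integers.

I want to plot each parameter against Time on the same plot, with the parameter legends taken directly from the first row of the .csv file.

My spreadsheets have different numbers of parameter columns to plot against Time, so I would like a generic solution that also derives the number of columns directly from the .csv file.

The minimal working example attached shows what I am trying to achieve using np.loadtxt (minus the legend), but I can't find a way to import the column labels from the .csv file to build the legend with this approach.

np.genfromtxt offers more functionality, but I am not familiar with it and am struggling to find a way to use it to do the above.

Plotting data in this way from .csv files must be a common problem, but I have been unable to find a solution on the web. I would be very grateful for your help and suggestions.

Many thanks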

"""
Example data: Data.csv:
Time,Parameter_1,Parameter_2,Parameter_3
0,10,0,10
1,20,30,10
2,40,20,20
3,20,10,30  
"""
import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('Data.csv', skiprows=1, delimiter=',') # skip the column labels
cols = data.shape[1] # get the number of columns in the array
for n in range (1,cols):
    plt.plot(data[:,0],data[:,n]) # plot each parameter against time
plt.xlabel('Time',fontsize=14)
plt.ylabel('Parameter values',fontsize=14)
plt.show()

Answer 1

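The function numpy.genfromtxt is aimed more at broken tables with missing values than at what you are trying to do. What you can do is simply open the file yourself and read the first line before handing the file to numpy.loadtxt. Then you don't even need to skip it. Here is an edited version of your code above that reads the labels and makes the legend: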

"""
Example data: Data.csv:
Time,Parameter_1,Parameter_2,Parameter_3
0,10,0,10
1,20,30,10
2,40,20,20
3,20,10,30  
"""
import numpy as np
import matplotlib.pyplot as plt

#open the file
with open('Data.csv') as f:
    #read the names of the columns first
    names = f.readline().strip().split(',')
    #np.loadtxt can also handle already open files
    data = np.loadtxt(f, delimiter=',') # no skip needed anymore

cols = data.shape[1]
for n in range (1,cols):
    #labels go in here
    plt.plot(data[:,0],data[:,n],label=names[n])

plt.xlabel('Time',fontsize=14)
plt.ylabel('Parameter values',fontsize=14)

#And finally the legend is made
plt.legend()
plt.show()
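
Since the question asks for something generic across files with different numbers of columns, the same idea can be wrapped in a small helper. This is only a minimal sketch under some assumptions: `load_labelled_csv` and `plot_against_first_column` are hypothetical names (not part of NumPy or Matplotlib), and an `io.StringIO` holding the sample data from the question stands in for `open('Data.csv')`.

```python
import io

import numpy as np
import matplotlib.pyplot as plt


def load_labelled_csv(source):
    """Return (names, data) from an open text file whose first row holds labels.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # Read the header row ourselves and split it into column labels
    names = [name.strip() for name in source.readline().split(',')]
    # loadtxt continues from the current file position, so no skiprows is needed;
    # ndmin=2 keeps the result 2-D even if the file has a single data row
    data = np.loadtxt(source, delimiter=',', ndmin=2)
    return names, data


def plot_against_first_column(names, data):
    """Plot every column after the first against the first, labelled by name."""
    fig, ax = plt.subplots()
    for n in range(1, data.shape[1]):
        ax.plot(data[:, 0], data[:, n], label=names[n])
    ax.set_xlabel(names[0], fontsize=14)
    ax.set_ylabel('Parameter values', fontsize=14)
    ax.legend()
    return ax


# Stand-in for open('Data.csv'), using the sample data from the question
sample = io.StringIO(
    "Time,Parameter_1,Parameter_2,Parameter_3\n"
    "0,10,0,10\n"
    "1,20,30,10\n"
    "2,40,20,20\n"
    "3,20,10,30\n"
)
names, data = load_labelled_csv(sample)
ax = plot_against_first_column(names, data)
# Call plt.show() afterwards to display the figure in an interactive session
```

With a real file, the call would be `with open('Data.csv') as f: names, data = load_labelled_csv(f)`. The number of plotted series follows from `data.shape[1]`, so nothing has to change when a spreadsheet has more or fewer parameter columns.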

Answer 2

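Here is my minimal working example using genfromtxt rather than loadtxt, in case it is helpful to anyone else. I'm sure there are more concise and elegant ways of doing this (I'm always happy to receive constructive criticism on how to improve my coding), but it makes sense and works OK: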

import numpy as np
import matplotlib.pyplot as plt

# dtype=str reads every cell, including the header row, as text
arr = np.genfromtxt('Data.csv', delimiter=',', dtype=str)
names = arr[0]  # select the first row of data = column names
# the remaining rows are numbers stored as text; convert them so they plot on numeric axes
data = arr[1:].astype(float)
for n in range(1, len(names)):  # plot each column in turn against column 0 (= time)
    plt.plot(data[:, 0], data[:, n], label=names[n])
plt.legend()
plt.show()
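
Another option that may be worth trying is genfromtxt's `names=True` argument, which reads the first row as field names and returns a structured array, so the labels and the numeric columns arrive together. The following is a sketch rather than a definitive recipe: an `io.StringIO` with the sample data from the question stands in for 'Data.csv', and it assumes the labels are plain identifiers like the ones shown (genfromtxt may alter labels that contain spaces or punctuation).

```python
import io

import numpy as np
import matplotlib.pyplot as plt

# Stand-in for 'Data.csv', using the sample data from the question
sample = io.StringIO(
    "Time,Parameter_1,Parameter_2,Parameter_3\n"
    "0,10,0,10\n"
    "1,20,30,10\n"
    "2,40,20,20\n"
    "3,20,10,30\n"
)

# names=True tells genfromtxt to take the field names from the first row
arr = np.genfromtxt(sample, delimiter=',', names=True)
labels = arr.dtype.names  # tuple of column labels read from the header
time_label = labels[0]    # the first column is assumed to be time

fig, ax = plt.subplots()
for label in labels[1:]:
    # Columns of a structured array are accessed by field name
    ax.plot(arr[time_label], arr[label], label=label)
ax.set_xlabel(time_label)
ax.legend()
# Call plt.show() afterwards to display the figure in an interactive session
```

To read the real file, pass `'Data.csv'` in place of `sample`. The loop runs over whatever field names were found, so the number of columns does not need to be known in advance.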
