How to plot multiple time series from a CSV when the data points are in different columns

Question

I have a DataFrame (loaded from a CSV file) that looks like this:

     Data      Mean        sd   time__1   time__2   time__3   time__4   time__5
0  Data_1  0.947667  0.025263  0.501517  0.874750  0.929426  0.953847  0.958375
1  Data_2  0.031960  0.017314  0.377588  0.069185  0.037523  0.024028  0.021532

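Now I want to plot two time series, one for Data_1 and one for Data_2, using time__1, time__2, etc. as the time points. The x axis should be the time points (time__1, time__2, etc.) and the y axis should be their associated values.

The code I am trying: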

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = pd.read_csv("file.csv", delimiter=',', header=0) 
data = data.drop(["Unnamed: 0"], axis=1)

# Set the date column as the index
data = data.set_index(["time__1", "time__2", "time__3", "time__4", "time__5"])
ax = data.plot(linewidth=2, fontsize=12)
ax.set_xlabel('Data')
ax.legend(fontsize=12)
plt.savefig("series.png")
plt.show()

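The figure I get is not what I expected.

I think I am doing something wrong with set_index(), because my time points are in different columns.

How can I plot the time series when the time points are in different columns?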

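Reproducible data in dictionary format (the output of to_dict() after the set_index() call above, so the keys are tuples of the five time values):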

{'Data': {(0.501517236232758, 0.874750375747681, 0.929425954818726, 0.953846752643585, 0.958374977111816): 'Data_1', (0.377588421106338, 0.069185301661491, 0.037522859871388, 0.0240284409374, 0.021532088518143): 'Data_2'}, 'Mean': {(0.501517236232758, 0.874750375747681, 0.929425954818726, 0.953846752643585, 0.958374977111816): 0.947667360305786, (0.377588421106338, 0.069185301661491, 0.037522859871388, 0.0240284409374, 0.021532088518143): 0.031959813088179}, 'sd': {(0.501517236232758, 0.874750375747681, 0.929425954818726, 0.953846752643585, 0.958374977111816): 0.025263005867601, (0.377588421106338, 0.069185301661491, 0.037522859871388, 0.0240284409374, 0.021532088518143): 0.017313838005066}}

Answer 1

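IIUC, you got the index wrong: if time__1, time__2, etc. should be your x axis, then those are what you want as the index. The names of the plotted data series come from the columns. So you need to transpose your DataFrame. Using the CSV data from your first table: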

print(df)

# out:
     Data      Mean        sd   time__1   time__2   time__3   time__4  \
0  Data_1  0.947667  0.025263  0.501517  0.874750  0.929426  0.953847   
1  Data_2  0.031960  0.017314  0.377588  0.069185  0.037523  0.024028   

    time__5  
0  0.958375  
1  0.021532

Drop the Mean and sd columns, set Data as the index, and transpose, assigning the result back to df:

df = df.drop(["Mean", "sd"], axis=1).set_index("Data").T

This produces a correctly shaped DataFrame:

Data       Data_1    Data_2
time__1  0.501517  0.377588
time__2  0.874750  0.069185
time__3  0.929426  0.037523
time__4  0.953847  0.024028
time__5  0.958375  0.021532

which can then simply be plotted:

df.plot()
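
For reference, the steps above can be put together into one runnable sketch. It assumes the CSV contains exactly the columns shown in the question (an inline string stands in for file.csv, and there is no "Unnamed: 0" column to drop); the name wide, the axis labels, and the marker style are illustrative choices, not part of the original code.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Inline stand-in for file.csv, using the values shown in the question.
csv_text = """Data,Mean,sd,time__1,time__2,time__3,time__4,time__5
Data_1,0.947667,0.025263,0.501517,0.874750,0.929426,0.953847,0.958375
Data_2,0.031960,0.017314,0.377588,0.069185,0.037523,0.024028,0.021532
"""
df = pd.read_csv(io.StringIO(csv_text))

# Drop the summary columns, label each row by its series name, then
# transpose: time points become the index (x axis), series become columns.
wide = df.drop(columns=["Mean", "sd"]).set_index("Data").T

# One line per column; the index labels are used as x tick labels.
ax = wide.plot(linewidth=2, fontsize=12, marker="o")
ax.set_xlabel("Time point")
ax.set_ylabel("Value")
ax.legend(fontsize=12)
plt.savefig("series.png")
```

If the time labels need to be numeric rather than strings, one option is to strip the "time__" prefix from the index and convert it to integers before plotting.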
