Printing months on the x axis with pyplot

The data I'm working with: https://drive.google.com/file/d/1xb7icmocz-SD2Rkq4ykTZowxW0uFFhBl/view?usp=sharing

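Hi all,

I'm having some trouble editing a plot. Basically, I want my x values to show the months of the year, but it doesn't seem to work, possibly because of the data type (?). Do you know how I can get my plot to show months on the x axis?

If you need more context about the data, please let me know!

Thanks!

Here is my plotting code and the initial data modification: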

import matplotlib.pyplot as plt
import mplleaflet
import pandas as pd
import matplotlib.dates as mdates
from matplotlib.dates import DateFormatter
import numpy as np

df = pd.read_csv("data/C2A2_data/BinnedCsvs_d400/fb441e62df2d58994928907a91895ec62c2c42e6cd075c2700843b89.csv")

df['degrees']=df['Data_Value']/10

df['Date'] =  pd.to_datetime(df['Date'])

df2 = df[df['Date']<'2015-01-01']

df3 = df[df['Date']>='2015-01-01']

max_temp = df2.groupby([(df2.Date.dt.month),(df2.Date.dt.day)])['degrees'].max()
min_temp = df2.groupby([(df2.Date.dt.month),(df2.Date.dt.day)])['degrees'].min()

max_temp2 = df3.groupby([(df3.Date.dt.month),(df3.Date.dt.day)])['degrees'].max()
min_temp2 = df3.groupby([(df3.Date.dt.month),(df3.Date.dt.day)])['degrees'].min()

max_temp.plot(x ='Date', y='degrees', kind = 'line')
min_temp.plot(x ='Date',y='degrees', kind= 'line')

plt.fill_between(range(len(min_temp)),min_temp, max_temp, color='C0', alpha=0.2)

ax = plt.gca()

ax.set(xlabel="Date",
       ylabel="Temperature",
       title="Extreme Weather in 2015")

plt.legend()
plt.tight_layout()

x = plt.gca().xaxis

for item in x.get_ticklabels():
    item.set_rotation(45)

plt.show()

The plot I get:

Option 1 (most similar approach)

Use Index.map and calendar to replace the index with month abbreviations.

This is shown for df2 only:

import calendar

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("...")
df['degrees'] = df['Data_Value'] / 10
df['Date'] = pd.to_datetime(df['Date'])
df2 = df[df['Date'] < '2015-01-01']

max_temp = df2.groupby([df2.Date.dt.month, df2.Date.dt.day])['degrees'].max()
min_temp = df2.groupby([df2.Date.dt.month, df2.Date.dt.day])['degrees'].min()

# Update the index to be the desired display format for x-axis
max_temp.index = max_temp.index.map(lambda x: f'{calendar.month_abbr[x[0]]}')
min_temp.index = min_temp.index.map(lambda x: f'{calendar.month_abbr[x[0]]}')

max_temp.plot(x='Date', y='degrees', kind='line')
min_temp.plot(x='Date', y='degrees', kind='line')

plt.fill_between(range(len(min_temp)), min_temp, max_temp,
                 color='C0', alpha=0.2)

ax = plt.gca()
ax.set(xlabel="Date", ylabel="Temperature", title="Extreme Weather 2005-2014")

x = plt.gca().xaxis
for item in x.get_ticklabels():
    item.set_rotation(45)
    
plt.margins(x=0)
plt.legend()
plt.tight_layout()
plt.show()
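
To see what the Index.map step does on its own, here is a minimal sketch on made-up data (the toy frame and its values are assumptions, not taken from the CSV). Each (month, day) tuple collapses to a month abbreviation, so every day in a month ends up with the same label:

```python
import calendar

import pandas as pd

# Hypothetical toy data standing in for the real CSV
dates = pd.to_datetime(["2005-01-01", "2006-01-01", "2005-01-02", "2005-02-01"])
toy = pd.DataFrame({"Date": dates, "degrees": [1.0, 3.0, 4.0, 5.0]})

# Same grouping as above: one row per (month, day) across all years
max_temp = toy.groupby([toy.Date.dt.month, toy.Date.dt.day])["degrees"].max()

# Each index entry is a (month, day) tuple; keep only the month abbreviation
labels = max_temp.index.map(lambda x: calendar.month_abbr[x[0]])
print(list(labels))  # ['Jan', 'Jan', 'Feb']
```

The labels are plain strings repeated once per day, which is why the axis only knows about positions and not about dates.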


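As an aside: the title "Extreme Weather in 2015" is incorrect, because this data includes all the years before 2015. It should be "Extreme Weather 2005-2014".

The year range can also be checked with min and max: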

print(df2.Date.dt.year.min(), '-', df2.Date.dt.year.max())
# 2005 - 2014

The title can then be generated programmatically:

title=f"Extreme Weather {df2.Date.dt.year.min()}-{df2.Date.dt.year.max()}"

Option 2 (simplifying the groupby step)

Use groupby aggregate to simplify the code and build a single DataFrame, then convert the index the same way as above:

import calendar

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("...")
df['degrees'] = df['Data_Value'] / 10
df['Date'] = pd.to_datetime(df['Date'])
df2 = df[df['Date'] < '2015-01-01']
# Get Max and Min Degrees in Single Groupby
df2_temp = (
    df2.groupby([df2.Date.dt.month, df2.Date.dt.day])['degrees']
        .agg(['max', 'min'])
)
# Convert Index to whatever display format is desired:
df2_temp.index = df2_temp.index.map(lambda x: f'{calendar.month_abbr[x[0]]}')

# Plot
ax = df2_temp.plot(
    kind='line', rot=45,
    xlabel="Date", ylabel="Temperature",
    title=f"Extreme Weather {df2.Date.dt.year.min()}-{df2.Date.dt.year.max()}"
)
# Fill between
plt.fill_between(range(len(df2_temp)), df2_temp['min'], df2_temp['max'],
                 color='C0', alpha=0.2)
plt.margins(x=0)
plt.tight_layout()
plt.show()
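
As a quick check of what the single groupby produces, the following sketch runs the same .agg(['max', 'min']) call on a few made-up rows (the toy frame is an assumption for illustration only). The result is one DataFrame with a max column and a min column, indexed by (month, day):

```python
import pandas as pd

# Hypothetical toy data: two years of readings for March 1 and March 2
dates = pd.to_datetime(["2005-03-01", "2006-03-01", "2005-03-02", "2006-03-02"])
toy = pd.DataFrame({"Date": dates, "degrees": [10.0, 14.0, 9.0, 7.0]})

# One groupby call yields both extremes side by side
toy_temp = (
    toy.groupby([toy.Date.dt.month, toy.Date.dt.day])["degrees"]
    .agg(["max", "min"])
)
print(toy_temp)
```

Having both columns in one frame is what lets a single plot call draw both lines and lets fill_between read df2_temp['min'] and df2_temp['max'] directly.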


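Option 3 (best overall functionality)

Use pd.to_datetime to convert the index to datetime. Pick any leap year to put all the data in a single year (it must be a leap year so that Feb-29 doesn't raise an error). Then call set_major_formatter with the format string %b to show month abbreviations: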

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("...")
df['degrees'] = df['Data_Value'] / 10
df['Date'] = pd.to_datetime(df['Date'])
df2 = df[df['Date'] < '2015-01-01']
# Get Max and Min Degrees in Single Groupby
df2_temp = (
    df2.groupby([df2.Date.dt.month, df2.Date.dt.day])['degrees']
        .agg(['max', 'min'])
)

# Convert to DateTime of Same Year
# (Must be a leap year so Feb-29 doesn't raise an error)
df2_temp.index = pd.to_datetime(
    '2000-' + df2_temp.index.map(lambda s: '-'.join(map(str, s)))
)
# Plot
ax = df2_temp.plot(
    kind='line', rot=45,
    xlabel="Date", ylabel="Temperature",
    title=f"Extreme Weather {df2.Date.dt.year.min()}-{df2.Date.dt.year.max()}"
)
# Fill between
plt.fill_between(df2_temp.index, df2_temp['min'], df2_temp['max'],
                 color='C0', alpha=0.2)

# Set xaxis formatter to month abbr with the %b format string
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
plt.tight_layout()
plt.show()
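
The leap-year requirement can be checked in isolation. This is a small sketch, assuming zero-padded date strings for simplicity (the code above builds unpadded ones); the month/day pairs are made up:

```python
import pandas as pd

# Hypothetical (month, day) pairs around the end of February
month_day = [(2, 28), (2, 29), (3, 1)]

# 2000 is a leap year, so Feb 29 is a valid date
leap = pd.to_datetime([f"2000-{m:02d}-{d:02d}" for m, d in month_day])

# 2001 is not a leap year, so the same conversion fails on Feb 29
try:
    pd.to_datetime([f"2001-{m:02d}-{d:02d}" for m, d in month_day])
    non_leap_ok = True
except ValueError:
    non_leap_ok = False

print(len(leap), non_leap_ok)  # 3 False
```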

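The benefit of this approach is that the index is a datetime, so the axis can be formatted more flexibly than with the string labels used in options 1 and 2.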
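
One thing a datetime axis makes possible is controlling where the ticks fall. The sketch below is not part of the approach above: it uses made-up values and plain matplotlib plotting, and pairs mdates.MonthLocator with DateFormatter('%b') to place one tick at the start of each month. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend, drop this for interactive use

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily series spanning the leap year 2000
idx = pd.date_range("2000-01-01", "2000-12-31", freq="D")
values = pd.Series(range(len(idx)), index=idx)

fig, ax = plt.subplots()
ax.plot(values.index, values.to_numpy())

# One major tick per month, labelled with the month abbreviation
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))

fig.canvas.draw()  # tick labels are only populated after a draw
labels = [t.get_text() for t in ax.get_xticklabels()]
```

Whether the locator combines cleanly with a plot drawn through DataFrame.plot may depend on how pandas sets up the date axis, so this is worth testing against the real data.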