Plot an infinite line between two pandas series points

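I would like to draw an infinite, non-ending line between two points that are stored as a pandas series. I can successfully draw a standard line between the points, but I don't want the line to "end"; it should continue instead. Expanding on this, I would also like to extract the values of this new infinite line into a new dataframe, so that I can see which line value corresponds to a given x value.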

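import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf
from scipy.signal import argrelextrema

# Download daily AAPL prices, skip the first 30 rows and lowercase the column names
data = yf.download("AAPL", start="2021-01-01", interval = "1d").drop(columns=['Adj Close'])
data = data[30:].rename(columns={"Open": "open", "High": "high", "Low": "low", "Close": "close", "Volume": "volume"})
# Row positions of the local maxima of 'high' and the local minima of 'low'
local_max = argrelextrema(data['high'].values, np.greater)[0]
local_min = argrelextrema(data['low'].values, np.less)[0]
highs = data.iloc[local_max,:]
lows = data.iloc[local_min,:]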

highesttwo = highs["high"].nlargest(2)
lowesttwo = lows["low"].nsmallest(2)

fig = plt.figure(figsize=[10,7])
data['high'].plot(marker='o', markevery=local_max)
data['low'].plot(marker='o', markevery=local_min)
highesttwo.plot()
lowesttwo.plot()
plt.show()

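This is what my plot currently looks like:

This is how I would like it to look, and I would also like to be able to get the value of the line for a corresponding x value.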

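This can be done in a few steps, as shown in the following example, where the lines are computed with element-wise operations (i.e. vectorized) using the slope-intercept form of the line equation.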
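As a minimal sketch of the slope-intercept step on its own, here are two made-up points (not the AAPL data) with x taken as an integer row position. The names x1, y1, x2, y2, m and b are chosen purely for illustration. For exactly two points, np.polyfit with degree 1 should return the same slope and intercept as the hand-written formula.

```python
import numpy as np

# Two hypothetical points: x is an integer row position, y is a price
x1, y1 = 3, 10.0
x2, y2 = 7, 18.0

# Slope-intercept form: y = m*x + b
m = (y2 - y1) / (x2 - x1)
b = y1 - m * x1

# The same coefficients obtained from a degree-1 polynomial fit
m_fit, b_fit = np.polyfit([x1, x2], [y1, y2], 1)

# Evaluate the line for every x at once (vectorized), including x values
# before the first point and after the second one
x = np.arange(0, 12)
y = m * x + b
```

Because y is computed from the whole x array in one expression, extending the line only requires extending x.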

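The frequency of stock data depends on the days on which the stock exchange is open. pandas does not recognize this frequency automatically, so the .plot method produces a plot with a continuous date x-axis that includes the days with no data. This can be avoided by setting the argument use_index=False, so that the x-axis uses integers starting from zero instead.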
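To illustrate the difference between the two x-axes without downloading anything, the sketch below uses a hypothetical business-day index as a stand-in for trading days (an assumption: real exchange calendars also skip holidays). It compares the calendar-day spacing of the dates with the evenly spaced integer positions that use_index=False plots against.

```python
import numpy as np
import pandas as pd

# Hypothetical business-day index standing in for trading days
idx = pd.bdate_range('2021-01-04', periods=10)
prices = pd.Series(np.linspace(100.0, 109.0, 10), index=idx)

# Calendar-day gaps between consecutive rows: weekends appear as 3-day jumps
gaps = idx.to_series().diff().dt.days.dropna()

# Integer row positions: evenly spaced, with no weekend gaps
x_pos = np.arange(len(prices))
```

Any line fitted against x_pos is then expressed in row positions, not calendar days, which is why the answer computes the slopes from the reset integer index.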

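The challenge is then to create nicely formatted tick labels. The following example attempts to imitate the pandas tick format by using list comprehensions to select the tick locations and format the labels. These will need to be adjusted if the date range is significantly lengthened or shortened.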
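The tick selection can be tried in isolation. This is a simplified sketch on a hypothetical business-day index, not the downloaded data: a major tick is placed on the first row and wherever the month changes, and the year is added only to the first label (the full example also adds it whenever the year changes).

```python
import pandas as pd

# Hypothetical business-day index spanning three months
idx = pd.bdate_range('2021-01-25', '2021-03-05')

# Major tick positions: the first row, plus every row where the month changes
tks_maj = [i for i, ts in enumerate(idx)
           if i == 0 or ts.month != idx[i - 1].month]

# Major tick labels: month abbreviation, with the year on the first tick only
labels_maj = [idx[pos].strftime('%b %Y') if n == 0 else idx[pos].strftime('%b')
              for n, pos in enumerate(tks_maj)]
```

The positions are plain integers, so they can be passed to ax.set_xticks on an axis drawn with use_index=False.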

import numpy as np                      # v 1.19.2
import pandas as pd                     # v 1.2.3
import matplotlib.pyplot as plt         # v 3.3.4
from scipy.signal import argrelextrema  # v 1.6.1
import yfinance as yf                   # v 0.1.54

# Import data
data = (yf.download('AAPL', start='2021-01-04', end='2021-03-15', interval='1d')
         .drop(columns=['Adj Close']))
data = data.rename(columns={'Open': 'open', 'High': 'high', 'Low': 'low',
                            'Close': 'close', 'Volume': 'volume'})

# Extract points and get appropriate x values for the points by using
# reset_index for highs/lows
local_max = argrelextrema(data['high'].values, np.greater)[0]
local_min = argrelextrema(data['low'].values, np.less)[0]
highs = data.reset_index().iloc[local_max, :]
lows = data.reset_index().iloc[local_min, :]
htwo = highs['high'].nlargest(2).sort_index()
ltwo = lows['low'].nsmallest(2).sort_index()

# Compute slope and y-intercept for each line
slope_high, intercept_high = np.polyfit(htwo.index, htwo, 1)
slope_low, intercept_low = np.polyfit(ltwo.index, ltwo, 1)

# Create dataframe for each line by using reindexed htwo and ltwo so that the
# index extends to the end of the dataset and serves as the x variable then
# compute y values
# High
line_high = htwo.reindex(range(htwo.index[0], len(data))).reset_index()
line_high.columns = ['x', 'y']
line_high['y'] = slope_high*line_high['x'] + intercept_high
# Low
line_low = ltwo.reindex(range(ltwo.index[0], len(data))).reset_index()
line_low.columns = ['x', 'y']
line_low['y'] = slope_low*line_low['x'] + intercept_low

# Plot data using pandas plotting function and add lines with matplotlib function
fig = plt.figure(figsize=[10,6])
ax = data['high'].plot(marker='o', markevery=local_max, use_index=False)
data['low'].plot(marker='o', markevery=local_min, use_index=False)
ax.plot(line_high['x'], line_high['y'])
ax.plot(line_low['x'], line_low['y'])
ax.set_xlim(0, len(data)-1)

# Set major and minor tick locations
tks_maj = [idx for idx, timestamp in enumerate(data.index)
           if (timestamp.month != data.index[idx-1].month) | (idx == 0)]
tks_min = range(len(data))
ax.set_xticks(tks_maj)
ax.set_xticks(tks_min, minor=True)

# Format major and minor tick labels
labels_maj = [ts.strftime('\n%b\n%Y') if (data.index[tks_maj[idx]].year
              != data.index[tks_maj[idx-1]].year) | (idx == 0)
              else ts.strftime('\n%b') for idx, ts in enumerate(data.index[tks_maj])]
labels_min = [ts.strftime('%d') if (idx+3)%5 == 0 else ''
              for idx, ts in enumerate(data.index[tks_min])]
ax.set_xticklabels(labels_maj)
ax.set_xticklabels(labels_min, minor=True)

plt.show()
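To look up line values by date instead of by integer position, and to project the line beyond the last row of data, one option is to build a dataframe indexed by dates. This is only a sketch under stated assumptions: slope, intercept and dates are made-up stand-ins for the fitted coefficients and data.index, n_future is a hypothetical parameter, and future dates are generated with pd.bdate_range, which ignores exchange holidays.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for a fitted line and the date index of the data
slope, intercept = 0.5, 120.0
dates = pd.bdate_range('2021-01-04', periods=8)

# Extend x past the last row so the line continues into the future
n_future = 3  # assumed number of extra business days to project
x = np.arange(len(dates) + n_future)
all_dates = pd.bdate_range(dates[0], periods=len(x))

# One row per date, holding the integer position and the line value
line = pd.DataFrame({'x': x, 'y': slope * x + intercept}, index=all_dates)

# Line value for a given date
value = line.loc[pd.Timestamp('2021-01-06'), 'y']
```

With real data, the dates of the existing rows should come from data.index itself, so that each x position matches an actual trading day.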



More tick formatting examples can be found here; see also: Date string format codes