Cumulative frequency distribution of histogram bin(s) inserts

Question

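I would like to ask whether there is a way to plot the inserts into histogram bin(s) as a time series.

The input is a large list of x and y values (integers). I can easily generate a histogram with plt.hist() and multiple bins, but I would like to see the inserts into bin 0 over time (the x values).

So in the plot I would have an x-axis showing time, a y-axis showing the count, and one cumulative line for each bin.

Thanks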

Answer 1

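I am not sure I understand your question correctly, but I tend to think you want to visualize how values are inserted into a given bin over time, as the values arrive one after another.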

In [1]: import numpy
   ...: import pandas
   ...: import matplotlib.pyplot as plt

# Let's first create the example data set...
In [2]: lowest, highest = 0, 10
   ...: amount_of_samples = 100
   ...: samples = numpy.random.randint(lowest, highest+1, amount_of_samples)

# ...then make 6 bins...
In [3]: amount_of_bins = 6
   ...: bins = [
   ...:     (i*highest/amount_of_bins, (i+1)*highest/amount_of_bins)
   ...:     for i in range(amount_of_bins)
   ...: ]

# ...and, finally, assign each value to the appropriate bin.
In [4]: df = pandas.DataFrame(
   ...:     data=[
   ...:         [1 if (sample >= interval[0] and sample < interval[1]) else 0
   ...:          for interval in bins]
   ...:         for sample in samples],
   ...:     columns=[f'{bins[i][0]:.1f}-{bins[i][1]:.1f}' for i in range(amount_of_bins)],
   ...: )

# We need to account for the fact that the above command excludes
# the highest value from the bins
In [5]: last_bin_column = f'{bins[-1][0]:.1f}-{bins[-1][1]:.1f}'
   ...: df[last_bin_column] += (samples == highest)

# We then get a cumulative sum for each bin, through time
In [6]: df = df.cumsum()

In [7]: ax = df.plot()

In [8]: ax.set_xlabel('time')
Out[8]: Text(0.5, 0, 'time')

In [9]: plt.savefig('hist.jpg', bbox_inches='tight')
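
The session above uses the sample index as the time axis. If the data really comes as (x, y) pairs with x as the time, as the question describes, the same idea can be sketched with NumPy alone. This is only a rough sketch under assumptions: the function name `cumulative_bin_counts` and its `n_bins` parameter are made up for illustration, and the bins are taken to be equal-width over the range of y, as `plt.hist` does by default.

```python
import numpy as np


def cumulative_bin_counts(x, y, n_bins=6):
    """Hypothetical helper: running count of y-values per bin, ordered by time x."""
    x = np.asarray(x)
    y = np.asarray(y)

    # Process the samples in time order.
    order = np.argsort(x, kind="stable")
    x_sorted, y_sorted = x[order], y[order]

    # Equal-width bin edges over the range of y (an assumption of this sketch).
    edges = np.linspace(y.min(), y.max(), n_bins + 1)

    # Digitize against the interior edges only, so every bin is half-open
    # [low, high) except the last one, which also contains the maximum.
    bin_index = np.digitize(y_sorted, edges[1:-1])

    # One row per sample, with a 1 in the column of the bin it fell into;
    # the cumulative sum down the rows is the running count for each bin.
    one_hot = np.eye(n_bins, dtype=int)[bin_index]
    counts = one_hot.cumsum(axis=0)
    return x_sorted, edges, counts


# Small example: four samples, two bins.
times, edges, counts = cumulative_bin_counts([3, 1, 2, 4], [0, 10, 5, 10], n_bins=2)
```

Each column of `counts` can then be drawn against `times`, for example with `plt.step(times, counts[:, k])` for bin `k`, which gives one cumulative line per bin. The last row of `counts` should agree with the bar heights that a histogram with the same number of bins would show.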


python

matplotlib

histogram

pandas