Cumulative Frequency Distribution of Histogram Bin(s) Inserts
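I would like to ask whether there is a way to plot the inserts into histogram bin(s) as a time series.
The input is a large list of x and y values (integers). I can easily generate a histogram with plt.hist()
and multiple bins, but I would like to see the inserts into bin 0 over time (the x values).
So in the plot I would have an x-axis showing time, a y-axis showing the count, and one cumulative line per bin.
Thanks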
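I am not sure I understood your question correctly, but I take it that you want to visualize how values are inserted into a given bin over time, as they arrive one by one.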
In [1]: import numpy
...: import pandas
...: import matplotlib.pyplot as plt
# Let's first create the example data set...
In [2]: lowest, highest = 0, 10
...: amount_of_samples = 100
...: samples = numpy.random.randint(lowest, highest+1, amount_of_samples)
# ...then make 6 bins...
In [3]: amount_of_bins = 6
...: bins = [
...:     (i*highest/amount_of_bins, (i+1)*highest/amount_of_bins)
...:     for i in range(amount_of_bins)
...: ]
# ...and, finally, assign each value to the appropriate bin.
In [4]: df = pandas.DataFrame(
...:     data=[
...:         [1 if (sample >= interval[0] and sample < interval[1]) else 0
...:          for interval in bins]
...:         for sample in samples],
...:     columns=[f'{low:.1f}-{high:.1f}' for low, high in bins],
...: )
# We need to account for the fact that the above command excludes
# the highest value from the bins
In [5]: last_bin_column = f'{bins[-1][0]:.1f}-{bins[-1][1]:.1f}'
...: df[last_bin_column] += (samples == highest)
# We then get a cumulative sum for each bin, through time
In [6]: df = df.cumsum()
In [7]: ax = df.plot()
In [8]: ax.set_xlabel('time')
Out[8]: Text(0.5, 0, 'time')
In [9]: plt.savefig('hist.jpg', bbox_inches='tight')
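The session above uses the sample order as time. If the real data comes as (x, y) pairs with x as the time, the same idea might be written in vectorized form as below. This is only a sketch under assumptions: the helper name `cumulative_bin_counts` is made up for illustration, the bins are taken to be equal-width over the range of y, and the last bin is closed on the right so that the final counts agree with what `numpy.histogram` reports.

```python
import numpy as np
import pandas as pd


def cumulative_bin_counts(x, y, n_bins):
    """Hypothetical helper: cumulative count per bin of y, indexed by time x."""
    x = np.asarray(x)
    y = np.asarray(y)

    # Process the inserts in time order.
    order = np.argsort(x, kind="stable")
    x, y = x[order], y[order]

    # Equal-width bin edges over the range of y (an assumption of this sketch).
    edges = np.linspace(y.min(), y.max(), n_bins + 1)

    # Digitizing against the interior edges only gives indices 0..n_bins-1:
    # each bin is [low, high), and the maximum value lands in the last bin.
    bin_index = np.digitize(y, edges[1:-1])

    # One row per insert with a single 1 in the column of its bin,
    # then a running total down the rows.
    one_hot = np.eye(n_bins, dtype=int)[bin_index]
    columns = [f"{low:.1f}-{high:.1f}" for low, high in zip(edges[:-1], edges[1:])]
    return pd.DataFrame(one_hot.cumsum(axis=0), index=x, columns=columns)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    times = np.arange(100)
    values = rng.integers(0, 11, size=100)
    counts = cumulative_bin_counts(times, values, 6)
    # The last row should match the plain histogram of all values.
    print(counts.iloc[-1].to_numpy(), np.histogram(values, bins=6)[0])
```

Calling `.plot()` on the returned DataFrame should then draw one cumulative line per bin against the x values, in the same way as `df.plot()` in the session above.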