How to assign a specific value to a bin in a histogram in Python?

Question

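Dear extended family of computer scientists,

I was wondering whether it is possible to assign any value I choose to a particular bin of a histogram. As you can see, my code below produces a histogram in which 2 of the bins are each filled with 1 entry.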

# -*- coding: utf-8 -*-
"""
Created on Sat May  9 20:23:51 2020

@author: DeAngelo
"""

import matplotlib.pyplot as plt
import numpy as np
import math





fig,ax = plt.subplots(1,1)
a = np.array([11,75])
ax.hist(a, bins = [0,25,50,75,100])
ax.set_title("histogram of result")
ax.set_xticks([0,25,50,75,100])
ax.set_xlabel('marks')
ax.set_ylabel('no. of students')
plt.show()

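First, in theory, can you tell the computer to take the value that falls in the 75-100 bin and move it to the 0-25 bin? That would mean I now have 2 entries in the 0-25 bin, while my array is still a = [11, 75].

Another example: suppose I have an array `b = np.array([3])` and plot it in my histogram. I know it will be assigned to the 0-25 bin, but can I tell the computer to put it in the 75-100 bin instead?

If so, how?

Second, I know you can use np.mean(a) to compute the mean. But suppose I want to put that value into the bin corresponding to 75-100. Is that possible?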

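I have looked at the code in "How to assign a number to a value falling in a certain bin", but it might as well be written in ancient Egyptian hieroglyphs, and unfortunately my degree is in physics, not in hieroglyphs.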

If you could help me with this, it would mean a lot to me. <3

Answer 1

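A histogram is simply drawn as a bar chart, so you can manipulate the bar values. Here you can compute the histogram beforehand and plot it as a bar chart: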

import matplotlib.pyplot as plt
import numpy as np
import math

a = np.array([11,75])
# calculate histogram values
vals, bins = np.histogram(a, bins = [0,25,50,75,100])
width = np.ediff1d(bins)

fig,ax = plt.subplots(1,1)

# plot histogram values as bar chart
ax.bar(bins[:-1] + width/2, vals, width)
ax.set_title("histogram of result")
ax.set_xticks([0,25,50,75,100])
ax.set_xlabel('marks')
ax.set_ylabel('no. of students')
plt.show()

That reproduces your example. If you like, you can now manipulate the bar values before plotting:

# the bin values
vals 
>>> array([1, 0, 0, 1])

# bin edges
bins
>>> array([  0,  25,  50,  75, 100])

# do manipulation -> remove one count from 75-100 bin and put in 0-25 bin
vals[-1] -= 1
vals[0] += 1

# plot new graph
fig,ax = plt.subplots(1,1)

# plot histogram values as bar chart
ax.bar(bins[:-1] + width/2, vals, width)
ax.set_title("histogram of result")
ax.set_xticks([0,25,50,75,100])
ax.set_xlabel('marks')
ax.set_ylabel('no. of students')
plt.show()

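I have to ask: why do you want to do this? In your example, you want to compute the mean and put it in the wrong bin. You can certainly do that as shown above, but I am not sure what the result would mean at that point.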
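For the mean case from the question, a minimal sketch along the same lines might look like the following. The helper `add_to_bin` is hypothetical (it is not part of NumPy or Matplotlib), and it assumes the target bin is identified by its left edge.

```python
import numpy as np

a = np.array([11, 75])
edges = np.array([0, 25, 50, 75, 100])

# Count the real data first: one entry in 0-25, one in 75-100.
counts, _ = np.histogram(a, bins=edges)


def add_to_bin(counts, edges, left_edge, amount=1):
    """Hypothetical helper: return a copy of counts with `amount`
    added to the bin whose range starts at `left_edge`."""
    idx = int(np.flatnonzero(edges[:-1] == left_edge)[0])
    out = counts.copy()
    out[idx] += amount
    return out


# np.mean(a) is 43.0, which would normally land in the 25-50 bin;
# here it is counted in the 75-100 bin instead.
counts_with_mean = add_to_bin(counts, edges, left_edge=75)
print(counts_with_mean)
```

The resulting `counts_with_mean` can be passed to `ax.bar` in place of `vals` in the plotting code above.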

Answer 2

Yes, this is possible. You can capture the return value of the histogram function by assigning it to a variable:

h = ax.hist(a, bins = [0, 25, 50, 75, 100])
h

(array([1., 0., 0., 1.]),
 array([  0,  25,  50,  75, 100]),
 <a list of 4 Patch objects>)

As the documentation says, this "is a tuple (n, bins, patches)". We are only interested in the counts and the bins, so let's assign them to separate variables:

counts, bins, _ = h

Now you can manipulate the counts any way you like, e.g. move one count from the fourth bin to the first:

counts[3] -= 1
counts[0] += 1
counts

array([2., 0., 0., 0.])
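Editing `counts` does not by itself change the bars that were already drawn, because each bar keeps its own height. As an alternative to redrawing, a rough sketch that updates the existing patches in place could look like this. It keeps the third return value instead of discarding it, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

a = np.array([11, 75])
fig, ax = plt.subplots(1, 1)
counts, bins, patches = ax.hist(a, bins=[0, 25, 50, 75, 100])

# Move one count from the fourth bin to the first, as above.
counts[3] -= 1
counts[0] += 1

# Each patch is a Rectangle; set its height to the edited count.
for patch, count in zip(patches, counts):
    patch.set_height(count)

# Rescale the y-axis so the taller bar stays visible.
ax.set_ylim(0, counts.max() * 1.05)

heights = [float(p.get_height()) for p in patches]
print(heights)
```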

We can turn these data back into a histogram, as shown in the documentation under the weights parameter:

plt.hist(bins[:-1], bins, weights=counts);
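If the goal is to redirect a particular entry rather than edit the totals, one possible sketch is to compute a bin index per value and overwrite the indices you want to change. This is not taken from either answer. The array `b` and the `overrides` mapping are made up for illustration, and the sketch assumes every value lies within the outer edges 0 and 100.

```python
import numpy as np

edges = np.array([0, 25, 50, 75, 100])
b = np.array([3, 11, 75])  # illustrative data, values assumed within 0-100

# np.digitize gives 1-based bin numbers for increasing edges, so subtract 1.
idx = np.digitize(b, edges) - 1
# Keep a value equal to the last edge (100) in the last bin.
idx = np.clip(idx, 0, len(edges) - 2)

# Hypothetical override: send element 0 (the value 3) to the 75-100 bin.
overrides = {0: 3}
for position, target_bin in overrides.items():
    idx[position] = target_bin

# Count how many entries ended up in each bin.
counts = np.bincount(idx, minlength=len(edges) - 1)
print(counts)
```

The array `b` itself is unchanged; only the per-entry bin assignment differs. The `counts` can then be drawn with `ax.bar` as in Answer 1 or passed as `weights` as shown above.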
