Non-uniform bins in a histogram in R
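I want to split a dataset (a numeric vector) into intervals and draw a frequency histogram to see which values fall into each interval. If I use hist(dataset, breaks = 10), the dataset is split into about 10 equal-width intervals. Instead, I would like to split it into (say) 10 bins so that each interval contains at least 5% of the data points.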
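You can use the quantile() function to define bins that each hold an equal share of the data points.
Here is an example on exponentially distributed data: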
# Seed for the random number generation (for repeatability)
seed = 1313
set.seed(seed)  # without this call the seed value is never applied
# Sample size
N = 150
# Size of each bin (as proportion of N)
binsize = 0.05
# Sample data
x = rexp(N)
# Regular histogram (equal-width bins)
hist(x, breaks=20, freq=TRUE, main="Histogram on 20 equal-width bins", col="red")
# Quantiles of size `binsize`
x.quantiles = quantile(x, probs=seq(0, 1, binsize))
# Histogram on the equal-size breaks
# Note: with unequal-width bins and freq=TRUE, R warns that the bar areas are
# not proportional to the counts; the bar heights still show the counts.
hist(x, breaks=x.quantiles, freq=TRUE, main=paste("Approx. equal-size-bin 'Histogram' (bin-size=", binsize*100, "% of ", N, ")", sep=""), col="cyan")
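To check that the quantile breaks really give bins with about the same number of points, the counts can be tabulated directly. The following is only a sketch that reuses the settings from the example above; the names `breaks`, `counts` and `h` are introduced here for illustration, and the `unique()` call is an added safeguard that is not part of the original answer.

```r
# Same settings as in the example above
set.seed(1313)
N = 150
binsize = 0.05
x = rexp(N)

# Break points at every 5% quantile. unique() is an added safeguard:
# tied values would produce duplicate breaks, which cut() rejects.
breaks = unique(quantile(x, probs = seq(0, 1, binsize)))

# Count the observations in each bin; include.lowest = TRUE keeps min(x)
counts = table(cut(x, breaks = breaks, include.lowest = TRUE))
print(counts)  # each bin should hold roughly N * binsize points

# Density-scaled histogram: with unequal widths, the bar area (not the
# height) is proportional to the share of points in the bin
h = hist(x, breaks = breaks, freq = FALSE,
         main = "Equal-count bins (density scale)", col = "cyan")
```

With freq = FALSE the bars of wide bins become low and the bars of narrow bins become tall, so the plot shows the density of the data, while the table confirms how many points each bin holds.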