Calculating percentiles when the probability density function data is given as x and y
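I have data extracted from a PDF (probability density function) plot and stored in a CSV file, where x is the incubation time and y is the density. I want to calculate percentiles, for example the 95th. I am a bit confused: should I compute the percentile from the x values alone, i.e. with np.percentile(x, 95)?
[Figure: plot of the data]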
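Here is some code that uses np.trapz (as suggested by @pjs). We take the x and y arrays and assume they describe a PDF, so we first normalize the integral to 1. Then we shrink the upper integration limit one grid point at a time, searching backward from the right, until the integral drops to 0.95 or below.
I made up a multi-peak function for illustration: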
import numpy as np
import matplotlib.pyplot as plt

N = 1000
x = np.linspace(0.0, 6.0*np.pi, N)
y = np.sin(x/2.0)/x  # construct some multi-peak function
y[0] = y[1]          # x[0] is 0, so y[0] is NaN; patch it with its neighbour
y = np.abs(y)

plt.plot(x, y, 'r.')
plt.show()

# normalization
norm = np.trapz(y, x)
print(norm)
y = y/norm
print(np.trapz(y, x))  # after normalization

# now compute the integral, cutting the right limit down by one point
# with each iteration; stop as soon as we hit 0.95
for k in range(0, N):
    if k == 0:
        xx = x
        yy = y
    else:
        xx = x[0:-k]
        yy = y[0:-k]
    v = np.trapz(yy, xx)
    print(f"Integral {k} from {xx[0]} to {xx[-1]} is equal to {v}")
    if v <= 0.95:
        break
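The loop above can only stop on a grid point. As a possible refinement, here is a minimal sketch that builds the cumulative trapezoid integral once and inverts it with linear interpolation, so the result can fall between grid points. The helper name pdf_percentile is made up for this example, and the sketch assumes x is sorted in increasing order and y is non-negative.

```python
import numpy as np

def pdf_percentile(x, y, q):
    """Return the x value below which a fraction q (0..1) of the area under y lies.

    Hypothetical helper: assumes x is increasing and y >= 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # area of each trapezoid between neighbouring grid points
    areas = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    # cumulative integral, starting from 0 at x[0]
    cdf = np.concatenate(([0.0], np.cumsum(areas)))
    cdf /= cdf[-1]  # normalize so the total area is 1
    # invert the cumulative curve by linear interpolation
    return float(np.interp(q, cdf, x))

# sanity check on a uniform density over [0, 10]
x_uniform = np.linspace(0.0, 10.0, 101)
y_uniform = np.ones_like(x_uniform)
p95 = pdf_percentile(x_uniform, y_uniform, 0.95)
print(p95)
```

For the uniform density the 95th percentile should come out close to 9.5, which makes a quick sanity check before applying the function to real data.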
I tested both @Severin Pappadeux's method and np.percentile, and both gave me the same result for the 95th percentile.
Here is @Severin Pappadeux's code, but run on my own data:
import numpy as np
import matplotlib.pyplot as plt

x = [ 5.  ,  5.55,  6.1 ,  6.65,  7.2 ,  7.75,  8.3 ,  8.85,  9.4 ,
      9.95, 10.5 , 11.05, 11.6 , 12.15, 12.7 , 13.25, 13.8 , 14.35,
     14.9 , 15.45, 16.  ]
y = [0.03234577, 0.03401444, 0.03559847, 0.03719304, 0.03890566,
     0.04084201, 0.04309067, 0.04570878, 0.04871024, 0.05205822,
     0.05566298, 0.05938525, 0.06304516, 0.06643575, 0.06933978,
     0.07154828, 0.07287886, 0.07319211, 0.0724044 , 0.0704957 ,
     0.0675117 ]
N = len(x)
y[0] = y[1]  # carried over from the original code; overwrites the first data point
y = np.abs(y)

plt.plot(x, y, 'r.')
plt.show()

# normalization
norm = np.trapz(y, x)
print(norm)
y = y/norm
print(np.trapz(y, x))  # after normalization

# now compute the integral, cutting the right limit down by one point
# with each iteration; stop as soon as we hit 0.95
for k in range(0, N):
    if k == 0:
        xx = x
        yy = y
    else:
        xx = x[0:-k]
        yy = y[0:-k]
    v = np.trapz(yy, xx)
    print(f"Integral {k} from {xx[0]} to {xx[-1]} is equal to {v}")
    if v <= 0.95:
        break

# Output:
# 0.6057000785
# 1.0
# Integral 0 from 5.0 to 16.0 is equal to 1.0
# Integral 1 from 5.0 to 15.45 is equal to 0.9373418687777172
When I use np.percentile() on x, as @Zeek suggested:
np.percentile(x, 95)
# Output: 15.45
So both methods gave me 15.45 as the 95th percentile of x.
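One caveat worth checking: np.percentile(x, 95) treats x as a plain sample and never looks at y, so on this evenly spaced grid it simply returns the point 95% of the way along the x range. The sketch below, which is only an illustration, compares it with a density-weighted estimate for the data as posted and for the same y values in reversed order. The helper pdf_percentile and the reversed density are assumptions introduced for this example, not part of the original answers.

```python
import numpy as np

x = np.array([ 5.  ,  5.55,  6.1 ,  6.65,  7.2 ,  7.75,  8.3 ,  8.85,  9.4 ,
               9.95, 10.5 , 11.05, 11.6 , 12.15, 12.7 , 13.25, 13.8 , 14.35,
              14.9 , 15.45, 16.  ])
y = np.array([0.03234577, 0.03401444, 0.03559847, 0.03719304, 0.03890566,
              0.04084201, 0.04309067, 0.04570878, 0.04871024, 0.05205822,
              0.05566298, 0.05938525, 0.06304516, 0.06643575, 0.06933978,
              0.07154828, 0.07287886, 0.07319211, 0.0724044 , 0.0704957 ,
              0.0675117 ])

def pdf_percentile(x, y, q):
    """Hypothetical helper: invert the cumulative trapezoid integral of y over x."""
    areas = 0.5 * (y[1:] + y[:-1]) * np.diff(x)      # area of each trapezoid
    cdf = np.concatenate(([0.0], np.cumsum(areas)))  # cumulative area from x[0]
    cdf /= cdf[-1]                                   # normalize total area to 1
    return float(np.interp(q, cdf, x))               # linear interpolation

p_grid = float(np.percentile(x, 95))          # depends on the x grid only
p_posted = pdf_percentile(x, y, 0.95)         # density as posted
p_reversed = pdf_percentile(x, y[::-1], 0.95) # same values, mirrored shape

print("np.percentile(x, 95):   ", p_grid)
print("density as posted:      ", p_posted)
print("density reversed:       ", p_reversed)
```

If the two density-based values differ while np.percentile(x, 95) stays the same, that suggests the agreement at 15.45 comes from the grid spacing and the grid-point stopping rule rather than from the shape of the density.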