How to add a 95% confidence interval to a line chart in Plotly?

I have Benford test results in a DataFrame, test_show:

         Expected  Counts     Found        Dif    AbsDif    Z_score
Sec_Dig
0        0.119679    4318  0.080052  -0.039627  0.039627  28.347781
1        0.113890    2323  0.043066  -0.070824  0.070824  51.771489
2        0.108821    1348  0.024991  -0.083831  0.083831  62.513122
3        0.104330    1298  0.024064  -0.080266  0.080266  60.975864
4        0.100308    3060  0.056730  -0.043579  0.043579  33.683738
5        0.096677    6580  0.121987   0.025310  0.025310  19.884178
6        0.093375   10092  0.187097   0.093722  0.093722  74.804141
7        0.090352    9847  0.182555   0.092203  0.092203  74.687841
8        0.087570    8439  0.156452   0.068882  0.068882  56.587749
9        0.084997    6635  0.123007   0.038010  0.038010  31.646817

I am trying to plot the Benford results with Plotly.

This is the code I have tried so far:

import plotly.graph_objects as go


fig = go.Figure()
fig.add_trace(go.Bar(x=test_show.index,
                y=test_show.Found,
                name='Found',
                marker_color='rgb(55, 83, 109)',
                # color="color"
                ))
fig.add_trace(go.Scatter(x=test_show.index,
                y=test_show.Expected,
                mode='lines+markers',
                name='Expected'
                ))

fig.update_layout(
    title="Benford's Law",
    # titlefont is deprecated; the title font is set through title.font
    xaxis=dict(
        title=dict(text='Digits', font=dict(size=16)),
        tickmode='linear',
        tickfont_size=14),
    yaxis=dict(
        title=dict(text='% Percentage', font=dict(size=16)),
        tickfont_size=14,
    ),
    legend=dict(
        x=0,
        y=1.0,
        bgcolor='rgba(255, 255, 255, 0)',
        bordercolor='rgba(255, 255, 255, 0)'
    ))
fig.show()

How can I add a confidence interval for test_show["Expected"]?

As of Python 3.8, you can use NormalDist from the statistics module to calculate a confidence interval. With a slight adjustment to that approach, you can add it to your figure with fig.add_traces(), using two go.Scatter() traces, and then set fill='tonexty' and fillcolor='rgba(255, 0, 0, 0.2)' on the last trace, like this:

CI = confidence_interval(df.Expected, 0.95)
fig.add_traces([go.Scatter(x = df.index, y = df['Expected']+CI,
                           mode = 'lines', line_color = 'rgba(0,0,0,0)',
                           showlegend = False),
                go.Scatter(x = df.index, y = df['Expected']-CI,
                           mode = 'lines', line_color = 'rgba(0,0,0,0)',
                           name = '95% confidence interval',
                           fill='tonexty', fillcolor = 'rgba(255, 0, 0, 0.2)')])

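Note that this approach computes the confidence interval from the df.Expected series itself, which has only ten values, so CI is a single number and the band has the same width at every digit. That may not be what you want here. Let me know how this first suggestion works for you, and we can take it from there.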
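
To make that caveat concrete, here is a minimal sketch that uses only the standard library and shows what confidence_interval returns for the ten expected proportions. The function is the same one used in the complete code; the list name expected is only for illustration.

```python
from statistics import NormalDist

def confidence_interval(data, confidence=0.95):
    # Same calculation as in the complete code: a single half-width
    # derived from the spread of the values in `data`.
    dist = NormalDist.from_samples(data)
    z = NormalDist().inv_cdf((1 + confidence) / 2.)
    return dist.stdev * z / ((len(data) - 1) ** .5)

# The Expected column from the question, as a plain list (illustrative name).
expected = [0.119679, 0.113890, 0.108821, 0.104330, 0.100308,
            0.096677, 0.093375, 0.090352, 0.087570, 0.084997]

half_width = confidence_interval(expected, 0.95)
print(f"half-width: {half_width:.4f}")
```

The result is one value of roughly 0.0076, so Expected + CI and Expected - CI are just the Expected line shifted up and down by that constant amount.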

Complete code:

import plotly.graph_objects as go
import pandas as pd
from statistics import NormalDist

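def confidence_interval(data, confidence=0.95):
    # Fit a normal distribution to the sample (fmean and sample stdev)
    dist = NormalDist.from_samples(data)
    # Two-sided critical value, about 1.96 for confidence=0.95
    z = NormalDist().inv_cdf((1 + confidence) / 2.)
    # Half-width of the interval. This keeps the sqrt(n - 1) divisor of the
    # approach being adapted; the textbook standard error divides by sqrt(n).
    h = dist.stdev * z / ((len(data) - 1) ** .5)
    return h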


df = pd.DataFrame({'Expected': {0: 0.119679,
                      1: 0.11389,
                      2: 0.108821,
                      3: 0.10432999999999999,
                      4: 0.10030800000000001,
                      5: 0.096677,
                      6: 0.093375,
                      7: 0.090352,
                      8: 0.08757000000000001,
                      9: 0.084997},
                     'Counts': {0: 4318,
                      1: 2323,
                      2: 1348,
                      3: 1298,
                      4: 3060,
                      5: 6580,
                      6: 10092,
                      7: 9847,
                      8: 8439,
                      9: 6635},
                     'Found': {0: 0.080052,
                      1: 0.043066,
                      2: 0.024991,
                      3: 0.024064,
                      4: 0.056729999999999996,
                      5: 0.12198699999999998,
                      6: 0.187097,
                      7: 0.182555,
                      8: 0.156452,
                      9: 0.12300699999999999},
                     'Dif': {0: -0.039626999999999996,
                      1: -0.070824,
                      2: -0.08383099999999999,
                      3: -0.08026599999999999,
                      4: -0.043579,
                      5: 0.02531,
                      6: 0.093722,
                      7: 0.092203,
                      8: 0.068882,
                      9: 0.03801},
                     'AbsDif': {0: 0.039626999999999996,
                      1: 0.070824,
                      2: 0.08383099999999999,
                      3: 0.08026599999999999,
                      4: 0.043579,
                      5: 0.02531,
                      6: 0.093722,
                      7: 0.092203,
                      8: 0.068882,
                      9: 0.03801},
                     'Z_score': {0: 28.347781,
                      1: 51.771489,
                      2: 62.513121999999996,
                      3: 60.975864,
                      4: 33.683738,
                      5: 19.884178,
                      6: 74.804141,
                      7: 74.687841,
                      8: 56.587749,
                      9: 31.646817}})

test_show = df
fig = go.Figure()
fig.add_trace(go.Bar(x=test_show.index,
                y=test_show.Found,
                name='Found',
                marker_color='rgb(55, 83, 109)',
                # color="color"
                ))
fig.add_trace(go.Scatter(x=test_show.index,
                y=test_show.Expected,
                mode='lines+markers',
                name='Expected'
                ))

fig.update_layout(
    title="Benford's Law",
    # titlefont is deprecated; the title font is set through title.font
    xaxis=dict(
        title=dict(text='Digits', font=dict(size=16)),
        tickmode='linear',
        tickfont_size=14),
    yaxis=dict(
        title=dict(text='% Percentage', font=dict(size=16)),
        tickfont_size=14,
    ),
    legend=dict(
        x=0,
        y=1.0,
        bgcolor='rgba(255, 255, 255, 0)',
        bordercolor='rgba(255, 255, 255, 0)'
    ))

CI = confidence_interval(df.Expected, 0.95)

fig.add_traces([go.Scatter(x = df.index, y = df['Expected']+CI,
                           mode = 'lines', line_color = 'rgba(0,0,0,0)',
                           showlegend = False),
                go.Scatter(x = df.index, y = df['Expected']-CI,
                           mode = 'lines', line_color = 'rgba(0,0,0,0)',
                           name = '95% confidence interval',
                           fill='tonexty', fillcolor = 'rgba(255, 0, 0, 0.2)')])

fig.show()
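
If a band that varies by digit is closer to what you want, one possible alternative is a normal-approximation interval for each expected proportion. This is a sketch under assumptions that the code above does not make: each expected value is treated as a binomial proportion, and the sample size n is taken to be the sum of the Counts column. The names expected, counts, lower and upper are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Expected proportions and observed counts from the question's table.
expected = [0.119679, 0.113890, 0.108821, 0.104330, 0.100308,
            0.096677, 0.093375, 0.090352, 0.087570, 0.084997]
counts = [4318, 2323, 1348, 1298, 3060, 6580, 10092, 9847, 8439, 6635]

n = sum(counts)                   # assumed sample size: total observations
z = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value

# Assumption: half-width = z * sqrt(p * (1 - p) / n) for each digit.
half_widths = [z * sqrt(p * (1 - p) / n) for p in expected]
lower = [p - h for p, h in zip(expected, half_widths)]
upper = [p + h for p, h in zip(expected, half_widths)]

for digit, (lo, hi) in enumerate(zip(lower, upper)):
    print(f"{digit}: [{lo:.4f}, {hi:.4f}]")
```

To plot it, pass upper and lower as the y values of the two go.Scatter traces in place of df['Expected']+CI and df['Expected']-CI. With about 54,000 observations the band is narrow (about ±0.003 at digit 0), so it may be hard to see at the scale of the bars.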