Python Dash refresh page not updating source data

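I have written a basic plotly dash app that pulls in data from a csv and displays it on a chart. You can then toggle values in the app and the chart updates.

However, when I add new data to the csv (done once each day), the app does not pick up the new data when the page is refreshed.

The fix is usually to define app.layout as a function, as described here (scroll down to Updates on Page Load). You'll see in my code below that I have done that.

Here's my code: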

import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import numpy as np

import pandas as pd

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

path = 'https://raw.githubusercontent.com/tbuckworth/Public/master/CSVTest.csv'

df = pd.read_csv(path)
df2 = df[(df.Map==df.Map)]


def layout_function():

    df = pd.read_csv(path)
    df2 = df[(df.Map==df.Map)]
    
    available_strats = np.append('ALL',pd.unique(df2.Map.sort_values()))
    classes1 = pd.unique(df2["class"].sort_values())
    metrics1 = pd.unique(df2.metric.sort_values())
    
    return html.Div([
            html.Div([
                dcc.Dropdown(
                    id="Strategy",
                    options=[{"label":i,"value":i} for i in available_strats],
                    value=list(available_strats[0:1]),
                    multi=True
                ),
                dcc.Dropdown(
                    id="Class1",
                    options=[{"label":i,"value":i} for i in classes1],
                    value=classes1[0]
                ),
                dcc.Dropdown(
                    id="Metric",
                    options=[{"label":i,"value":i} for i in metrics1],
                    value=metrics1[0]
                )],
            style={"width":"20%","display":"block"}),
                
        html.Hr(),
    
        dcc.Graph(id='Risk-Report')          
    ])
            
app.layout = layout_function


@app.callback(
        Output("Risk-Report","figure"),
        [Input("Strategy","value"),
         Input("Class1","value"),
         Input("Metric","value"),
         ])

def update_graph(selected_strat,selected_class,selected_metric):
    if 'ALL' in selected_strat:
        df3 = df2[(df2["class"]==selected_class)&(df2.metric==selected_metric)]
    else:
        df3 = df2[(df2.Map.isin(selected_strat))&(df2["class"]==selected_class)&(df2.metric==selected_metric)]
    df4 = df3.pivot_table(index=["Fund","Date","metric","class"],values="value",aggfunc="sum").reset_index()
    traces = []
    for i in df4.Fund.unique():
        df_by_fund = df4[df4["Fund"] == i]
        traces.append(dict(
                x=df_by_fund["Date"],
                y=df_by_fund["value"],
                mode="lines",
                name=i
                ))
    
    if selected_class=='USD':
        tick_format=None
    else:
        tick_format='.2%'
    
    return {
            'data': traces,
            'layout': dict(
                xaxis={'type': 'date', 'title': 'Date'},
                yaxis={'title': 'Values','tickformat':tick_format},
                margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
                legend={'x': 0, 'y': 1},
                hovermode='closest'
            )
        }
    

if __name__ == '__main__':
    app.run_server(debug=True)

Things I've tried

  1. Removing the initial df = pd.read_csv(path) before def layout_function():. This results in an error.
  2. Creating a callback button to refresh the data, using this code:
@app.callback(
        Output('Output-1','children'),
        [Input('reload_button','n_clicks')]        
        )

def update_data(nclicks):
    if nclicks == 0:
        raise PreventUpdate
    else:
        df = pd.read_csv(path)
        df2 = df[(df.Map==df.Map)]
        return('Data refreshed. Click to refresh again')

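This doesn't produce an error, but the button doesn't refresh the data either.

  3. Defining df inside the update_graph callback. This reloads the data every time you toggle something, which isn't practical (my real data is > 10^6 rows, so I don't want to read it every time a user changes a toggle value).

In short, I think that defining app.layout = layout_function should do the job, but it doesn't. What am I missing/not seeing?

Any help appreciated.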

TL;DR: I would suggest that you simply load the data from within the callback. If the load time is too long, you could change the format (e.g. to feather) and/or reduce the data size via preprocessing. If this is still not fast enough, the next step would be to store the data in a server-side in-memory cache such as Redis.


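Since you are reassigning df and df2 inside layout_function, Python treats these variables as local to that function, so you are not modifying the df and df2 variables in the global scope. The update_graph callback therefore keeps reading the global df2 that was loaded once at startup. While you could get the behavior you want using the global keyword, the use of global variables is discouraged in Dash.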
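The scoping rule can be seen in isolation, without Dash or pandas. The following is a minimal sketch; the names `data`, `reload_local` and `reload_global` are made up for illustration and do not appear in the question.

```python
# Module-level variable, playing the role of the global df2 in the question.
data = "old"

def reload_local():
    # Assigning inside a function creates a new local name;
    # the module-level `data` is left untouched.
    data = "new"
    return data

def reload_global():
    # `global` makes the assignment rebind the module-level name instead.
    global data
    data = "new"
    return data

reload_local()
after_local = data    # the global was not changed by reload_local()

reload_global()
after_global = data   # the global was rebound by reload_global()

print(after_local, after_global)
```

This is what happens in layout_function: it reads the fresh csv into its own local df and df2, and those are discarded when the function returns.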

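The standard approach in Dash would be to load the data in a callback (or in layout_function) and store it in a Store object (or, equivalently, a hidden Div). The structure would be something like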

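import pandas as pd
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Output, Input, State
from dash.exceptions import PreventUpdate

# `app` and `path` are defined as in the question.
app.layout = html.Div([
    ...
    dcc.Store(id="store"), html.Div(id="trigger")
])

# Fires on page load and puts the freshly read data into the Store as JSON.
@app.callback(Output('store','data'), [Input('trigger','children')], prevent_initial_call=False)
def update_data(children):
    df = pd.read_csv(path)
    return df.to_json()

# Reads the data from the Store instead of from a global DataFrame.
@app.callback(Output("Risk-Report","figure"), [Input(...)], [State('store', 'data')])
def update_graph(..., data):
    if data is None:
        raise PreventUpdate
    df = pd.read_json(data)
    ...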

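However, this approach will typically be much slower than just reading the data from disk inside the callback (which seems to be what you are trying to avoid), because the data ends up being transferred between the server and the client.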
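If re-reading the file on every callback is the main worry, a lighter-weight option than Redis is to keep the parsed data in the server process and reload it only when the file has changed. This is not the Store pattern above and not something the Dash documentation prescribes; it is a rough sketch that assumes the csv is a local file (not a URL as in the question's example), and `load_if_changed` and `_cache` are hypothetical names.

```python
import os

# Hypothetical per-process cache: path -> (mtime at load time, loaded object).
_cache = {}

def load_if_changed(path, loader):
    """Return loader(path), calling loader again only if the file's mtime changed."""
    mtime = os.path.getmtime(path)
    cached = _cache.get(path)
    if cached is None or cached[0] != mtime:
        # First request, or the file was modified since the last load.
        _cache[path] = (mtime, loader(path))
    return _cache[path][1]
```

Inside update_graph this would be used as `df = load_if_changed(path, pd.read_csv)`, so the csv is parsed once per daily update rather than once per toggle. Each worker process keeps its own copy, and callbacks should treat the returned object as read-only; for several workers or machines, a shared cache such as Redis remains the more robust choice.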