Categorical data points are not displayed on scatter plot when using multi select drop-down

Suppose we have the following DataFrame named df, pulled from SQL:

ProdHouse   Date_Year   Date_Month
Software6   2001    Jan
Software6   2020    Feb
Software1   2004    Mar
Software4   2004    Apr
Software5   2004    May
Software3   2009    Dec
Software5   1995    Dec
Software3   1995    Oct

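The objective is to display the total number of products for each month, with a drop-down used to select the year(s). It appears that when the x-axis is categorical (i.e. months), the data points are not displayed. However, if I replace the months with integers, the points do show up.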

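import uuid

import dash
import plotly.graph_objects as go
from dash import dcc, html  # Dash 2.x; older releases use dash_core_components / dash_html_components

app = dash.Dash(__name__)  # assumed; the app object is not shown in the original post


def serve_layout():
    session_id = str(uuid.uuid4())

    return html.Div([
        html.Div(session_id, id='session-id', style={'display': 'none'}),
        html.Label('Year'),
        dcc.Dropdown(
            id='year-dropdown',
            options=[
                {'label': year, 'value': year} for year in df['Date_Year'].unique()
            ],
            value=[2020],  # a list, because multi=True
            multi=True),
        dcc.Graph(id='graph-with-dropdown')
    ], style={'width': '33%', 'display': 'inline-block'})


app.layout = serve_layout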


@app.callback(
    dash.dependencies.Output('graph-with-dropdown', 'figure'),
    [dash.dependencies.Input('year-dropdown', 'value')])
def update_figure(selected_year):
    print('selected_year:   ', selected_year)
    # selected_year is a list because the drop-down uses multi=True
    filtered_df = df[df.Date_Year.isin(selected_year)]
    # Count the products per production house and month
    df_grouped = (filtered_df.groupby(['ProdHouse', 'Date_Month'])
                  .size().rename('Total_Active_Products').reset_index())
    traces = []

    for i in filtered_df.ProdHouse.unique():
        df_by_ProdHouse = df_grouped[df_grouped['ProdHouse'] == i]
        traces.append(go.Scatter(
            # Points do appear if x is df_by_ProdHouse['Total_Active_Products'] (integers)
            x=df_by_ProdHouse['Date_Month'],
            y=df_by_ProdHouse['Total_Active_Products'],
            mode='markers',
            opacity=0.7,
            marker={
                'size': 15,
                'line': {'width': 0.5, 'color': 'white'}
            },
            name=i
        ))
    return {
        'data': traces,
        'layout': dict(
            xaxis={'type': 'linear', 'title': 'Active Products Per Month'},
            yaxis={'title': 'Total Active Products'},
            margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
            legend={'x': 0, 'y': 1},
            hovermode='closest',
            transition={'duration': 500},
        )
    }

How can I modify the code above so that the data is displayed on the plot?

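This answers the first part of the question, about the points not being displayed. I managed to display the categorical data by changing the scatter plot to a bar chart. Since the chart type changed, I removed the mode argument from the trace and the type setting from the x-axis.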

@app.callback(
    dash.dependencies.Output('graph-with-dropdown', 'figure'),
    [dash.dependencies.Input('year-dropdown', 'value')])
def update_figure(selected_year):
    print('selected_year:   ', selected_year)
    filtered_df = df[df.Date_Year.isin(selected_year)]
    # Count the products per production house and month
    df_grouped = (filtered_df.groupby(['ProdHouse', 'Date_Month'])
                  .size().rename('Total_Active_Products').reset_index())
    traces = []

    for i in filtered_df.ProdHouse.unique():
        df_by_ProdHouse = df_grouped[df_grouped['ProdHouse'] == i]
        # One bar trace per production house; no mode argument is needed for bars
        traces.append(go.Bar(
            x=df_by_ProdHouse['Date_Month'],
            y=df_by_ProdHouse['Total_Active_Products'],
            name=i
        ))
    return {
        'data': traces,
        'layout': dict(
            # No 'type' here, so the axis type is inferred from the month names
            xaxis={'title': 'Active Products Per Month'},
            yaxis={'title': 'Total Active Products'},
            margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
            legend={'x': 0, 'y': 1},
            hovermode='closest',
            transition={'duration': 500},
        )
    }
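
If the goal is to keep markers rather than bars, another possibility (not part of the original answer) is to keep a scatter trace and declare the x-axis as a category axis with an explicit month order. The sketch below builds the figure as plain dicts, which dcc.Graph also accepts. The names MONTH_ORDER and build_month_scatter are made up for illustration, and the month labels are assumed to be the abbreviated English names from the question's df.

```python
import pandas as pd

# Hypothetical constant: calendar order for the abbreviated month names in df.
MONTH_ORDER = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def build_month_scatter(df, selected_years):
    """Return a figure dict with one scatter trace per ProdHouse."""
    filtered = df[df['Date_Year'].isin(selected_years)]
    grouped = (filtered.groupby(['ProdHouse', 'Date_Month'])
               .size()
               .rename('Total_Active_Products')
               .reset_index())
    traces = []
    for house, rows in grouped.groupby('ProdHouse'):
        traces.append({
            'type': 'scatter',
            'mode': 'markers',
            'x': rows['Date_Month'].tolist(),
            'y': rows['Total_Active_Products'].tolist(),
            'name': house,
        })
    layout = {
        # 'category' instead of 'linear': month names are labels, not numbers.
        'xaxis': {'type': 'category',
                  'categoryorder': 'array',
                  'categoryarray': MONTH_ORDER,
                  'title': 'Active Products Per Month'},
        'yaxis': {'title': 'Total Active Products'},
    }
    return {'data': traces, 'layout': layout}


# Sample data copied from the question.
df = pd.DataFrame({
    'ProdHouse': ['Software6', 'Software6', 'Software1', 'Software4',
                  'Software5', 'Software3', 'Software5', 'Software3'],
    'Date_Year': [2001, 2020, 2004, 2004, 2004, 2009, 1995, 1995],
    'Date_Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Dec', 'Dec', 'Oct'],
})
figure = build_month_scatter(df, [2004])
```

In the callback, the returned dict could be used directly as the figure value. The explicit categoryarray is there so that months are laid out in calendar order rather than in the order they first appear in the data.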

Alternatively, if you still want to use a scatter plot, convert df['Date_Month'] and df['Date_Year'] from categories into date objects, e.g. May 2020 becomes 2020-05-01.

This can be done as in the following example:

import pandas as pd

df = pd.DataFrame({
    'ProdHouse': ['software 1', 'software 2', 'software 3', 'software 4', 'software 3'],
    'Date_Year': [2018, 2018, 2018, 2018, 2018],
    'Date_Month': ['January', 'February', 'March', 'April', 'May'],
    'Total_Active_Products': [1, 2, 7, 8, 6]})

# Start and end of the range, e.g. 'January-2018' and 'June-2018'
date_1 = '{}-{}'.format(df['Date_Month'].iloc[0], df['Date_Year'].iloc[0])
date_2 = '{}-{}'.format('June', df['Date_Year'].iloc[4])

# freq='M' yields one month-end date per row (2018-01-31 ... 2018-05-31);
# recent pandas versions spell this alias 'ME'
df['dates'] = pd.date_range(date_1, date_2, freq='M')
print(df)
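
The date_range example relies on the rows being consecutive months. For data shaped like the question's df, where each row carries its own year and month, a per-row conversion may fit better. The following is a minimal sketch under the assumption that the months are abbreviated English names ('Jan', 'Feb', ...) and that the process runs with an English locale, since '%b' is locale-dependent; the sample rows are taken from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'ProdHouse': ['Software6', 'Software6', 'Software1', 'Software3'],
    'Date_Year': [2001, 2020, 2004, 2009],
    'Date_Month': ['Jan', 'Feb', 'Mar', 'Dec'],
})

# Join year and month into strings such as '2020-Feb', then parse them.
# '%b' matches abbreviated month names; the day defaults to the 1st.
df['dates'] = pd.to_datetime(
    df['Date_Year'].astype(str) + '-' + df['Date_Month'],
    format='%Y-%b',
)
print(df)
```

For full month names such as 'January', the format string would be '%Y-%B' instead.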

Since you are now working with date objects, replace isin with the following:

filtered_df = df[(pd.to_datetime(df.dates).dt.year >= selected_year_min)
                 & (pd.to_datetime(df.dates).dt.year <= selected_year_max)]

Adjust the code above accordingly. It is meant to take the minimum and maximum year from the drop-down selection.
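
One way the minimum and maximum year might be derived from the multi-select value is sketched below. The helper names year_bounds and filter_by_year_range are hypothetical, and the handling of an empty selection (returning an empty frame) is an assumption rather than something the answer specifies.

```python
import pandas as pd


def year_bounds(selected_years):
    """Return (min_year, max_year) for a multi-select value, or None if empty."""
    if not selected_years:
        return None
    return min(selected_years), max(selected_years)


def filter_by_year_range(df, selected_years):
    """Keep rows whose 'dates' year lies within the selected min/max years."""
    bounds = year_bounds(selected_years)
    if bounds is None:
        return df.iloc[0:0]  # empty frame with the same columns
    year_min, year_max = bounds
    years = pd.to_datetime(df['dates']).dt.year
    return df[(years >= year_min) & (years <= year_max)]


df = pd.DataFrame({
    'ProdHouse': ['Software6', 'Software1', 'Software3'],
    'dates': pd.to_datetime(['2001-01-01', '2004-03-01', '2009-12-01']),
})
print(filter_by_year_range(df, [2001, 2009]))
```

Note that a min/max range behaves differently from isin: selecting 2001 and 2009 also keeps the 2004 row, because every year between the two bounds is included.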

Finally, change the x input of the scatter plot as follows:

traces.append(go.Scatter(
    x=df_by_ProdHouse['dates'],
    y=df_by_ProdHouse['Total_Active_Products'],
    mode='lines+markers',
    line={
        'color': '#CD5C5C',
        'width': 2},
    marker={
        'color': '#CD5C5C',
        'size': 10,
        'symbol': "diamond-open"
    },
))

return {
    'data': traces,
    'layout': dict(
        xaxis={'title': 'Date',
               'showticklabels': True,
               'linecolor': 'rgb(204, 204, 204)',
               'linewidth': 2,
               'ticks': 'outside'},
        yaxis={'title': 'Total Active Products'},
        margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
        legend={'x': 0, 'y': 1},
        hovermode='closest',
        transition={'duration': 500},
        title={
            'text': "Softwares",
            'y': 0.9,
            'x': 0.5,
            'xanchor': 'center',
            'yanchor': 'top'},
        font=dict(
            color="#7f7f7f"
        )
    )
}