Dash by Plotly

Let’s say you have been working on a client segmentation project. Your client segments are well separated, and your final task is to present the findings and results to the project stakeholders. Usually, none of them has the technical expertise to understand your code, so you need to present your work as simply and visually as possible. Something that can really help with that task is Dash (and Plotly).

Dash is a productive Python framework for building web applications. Because it is written on top of Flask, Plotly.js and React.js, Dash is ideal for building data visualization apps. It is particularly suited for anyone who works with data in Python.

Dash apps are rendered in the web browser. You can deploy your apps to servers and then share them through URLs. Since Dash apps are viewed in the web browser, Dash is inherently cross-platform and mobile ready.

You can find numerous examples of Dash apps in the Dash App Gallery.

To install Dash, run the following commands in a terminal:

pip install dash==0.39.0  # The core dash backend
pip install dash-daq==0.1.0  # DAQ components (newly open-sourced!)

The first thing you need to understand is that Dash apps are composed of two parts. The first part is the layout of the app, which describes what the application looks like. The second part describes the interactivity of the application.

First, I will explain the layout part. Dash provides Python classes for all of the visual components of the application. Sets of components are maintained in the dash_core_components and dash_html_components libraries. You can also build your own components with JavaScript and React.js.

The dash_html_components library provides classes for all of the HTML tags, and their keyword arguments describe HTML attributes such as style, className, and id. The dash_core_components library, on the other hand, provides higher-level components such as controls and graphs.

You can find more detailed and practical explanations of the component libraries in the official Dash documentation.

As mentioned earlier, the second part of a Dash app, which is responsible for the application’s interactivity, is specified through the @app.callback decorator. This decorator takes Output and Input arguments. In Dash, the inputs and outputs of our application are simply the properties of a particular component.

Additionally, you can specify State by using dash.dependencies.State. This allows you to pass extra component values into a callback without changes to those values firing the callback. For more information on State, see the official documentation.

Now let’s build a simple single-page Dash app. You can find the whole project at the following link.

After importing the libraries and the dataset, we preprocess the data and define the Plotly charts.

Now comes the Dash part. Dash apps are web applications, and Dash uses Flask as its web framework. The underlying Flask app is available at app.server, or you can pass your own Flask app instance into Dash. In our example, we used the second option and passed in our own Flask app instance.

server = flask.Flask('main_app')
server.secret_key = os.environ.get('secret_key', 'secret')

external_stylesheets = [
    'https://codepen.io/chriddyp/pen/bWLwgP.css',
    {
        'href': 'https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css',
        'rel': 'stylesheet',
        'integrity': 'BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u',
        'crossorigin': 'anonymous'
    }
]

app = dash.Dash(__name__, server=server, external_stylesheets=external_stylesheets)

After setting up the app, we need to define all the components our dashboard will consist of. Components are specified as lists that are then assigned to the children property of our main html.Div component. There can be one list or several. In our example, I defined a main list called children_list and appended the radar_plots sublist to the children of the last html.Div component in children_list.

After specifying children_list, we assign it to the children property of the main html.Div, which serves as the root of app.layout.

app.layout = html.Div(children=children_list)

Every interactive part of our dashboard needs its own @app.callback. The component that requires refreshing is our Clients dataset table, which is made interactive by its filtering, sorting and pagination options. These are handled by an @app.callback on a function we called update_table(), which takes pagination_settings, sorting_settings and filtering_settings as inputs.
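
To make the backend sorting and pagination concrete, here is a minimal sketch of that logic on a plain list of dictionaries, without pandas or Dash. The function name page_of_rows and the sample rows are hypothetical and only serve as an illustration. The shapes of pagination_settings (a dict with current_page and page_size) and sorting_settings (a list of dicts with column_id and direction) follow the ones used by update_table() in the full script.

```python
def page_of_rows(rows, pagination_settings, sorting_settings):
    """Return the rows of one page after applying multi-column sorting."""
    sorted_rows = list(rows)  # copy, so the caller's list is left untouched

    # Apply the sort keys from last to first. Python's sort is stable,
    # so the first entry in sorting_settings ends up as the primary key.
    for col in reversed(sorting_settings):
        sorted_rows.sort(
            key=lambda row: row[col['column_id']],
            reverse=(col['direction'] != 'asc'),
        )

    # Slice out the requested page.
    start = pagination_settings['current_page'] * pagination_settings['page_size']
    end = start + pagination_settings['page_size']
    return sorted_rows[start:end]


# Hypothetical sample data, not taken from the tutorial's dataset
rows = [
    {'name': 'A', 'age': 41},
    {'name': 'B', 'age': 29},
    {'name': 'C', 'age': 35},
]

first_page = page_of_rows(
    rows,
    pagination_settings={'current_page': 0, 'page_size': 2},
    sorting_settings=[{'column_id': 'age', 'direction': 'asc'}],
)
print(first_page)
```

In the real callback the same two steps are done with DataFrame.sort_values() and DataFrame.iloc, and the resulting page is what gets written to the data property of the table.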

dash_app.py code:

# import libraries
import pandas as pd
import numpy as np
import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html
import dash_table
import flask
import os
import plotly.graph_objs as go

# set display options
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.float_format', lambda x: '%.3f' % x)

# import data
df = pd.read_csv('../input_files/segmentation_data_and_clv_fake.csv')

# preprocessing data
df = df.rename(str.lower, axis='columns')
df['cluster'] = df['cluster'].astype('str')
df['clv'] = np.round(df['clv'], 2)
df.drop(['observation_date', 'birth_dt'], axis=1, inplace=True)

# strip double quotes inside string values (non-string columns are unaffected)
for i in list(df.columns):
    df[i] = df[i].replace('"', '', regex=True)

# measures on a cluster level
df_agg = df[['cluster_label', 'cluster', 'recency', 'frequency', 'monetary_value', 'tenure', 'clv', 'age']].groupby(
    by=['cluster_label', 'cluster'], as_index=False).agg(
    ['count', 'mean', 'std', 'min', 'median', 'max']).sort_values(
    by=['cluster_label', 'cluster'], ascending=False)

df_agg.columns = ['_'.join(x) for x in df_agg.columns.ravel()]
df_agg.reset_index(inplace=True)

df_agg_mean = df_agg[['cluster_label', 'cluster', 'recency_mean', 'frequency_mean', 'monetary_value_mean',
                       'tenure_mean', 'clv_mean', 'age_mean']]

df_agg_mean_t = df_agg_mean.T
df_agg_mean_t.columns = ['regular', 'promising', 'premium', 'needing_attention', 'dormant', 'about_to_sleep']
df_agg_mean_t = df_agg_mean_t.iloc[2:-1, :]

# centroids overview table
df_agg_mean_centroids = df_agg_mean_t.copy()
df_agg_mean_centroids.reset_index(inplace=True)
df_agg_mean_centroids.rename(columns={'index':'centroid_parameter'}, inplace=True)

# column lists
df_agg_mean_t_cols = ['regular', 'promising', 'premium', 'needing_attention', 'dormant', 'about_to_sleep']
df_cols = list(df.columns)
df_agg_mean_centroids_cols = list(df_agg_mean_centroids.columns)

PAGE_SIZE = 10

# Plotly
# CLIENT SEGMENTS
# pie chart for clients segments
labels = np.sort(df['cluster_label'].unique())
df_cl_label = df['cluster_label'].value_counts().to_frame().sort_index()
value_list = df_cl_label['cluster_label'].tolist()

trace = go.Pie(labels=labels,
               values=value_list,
               marker=dict(
                   colors=['rgb(42,60,142)', 'rgb(199,119,68)', 'rgb(91,138,104)', 'rgb(67,125,178)', 'rgb(225,184,10)',
                           'rgb(165,12,12)'])
               )
data = [trace]
layout = go.Layout(title='Client segments')
pie_fig = go.Figure(data=data, layout=layout)

# scatterpolar charts for centroids
radar_plots = []

for feat in df_agg_mean_t.index:
    data = [go.Scatterpolar(
        r=df_agg_mean_t.loc[feat, :].values,
        theta=df_agg_mean_t.columns,
        fill='toself',
        name=feat
    )]

    layout = go.Layout(title=feat + ' by segment')
    radar_fig = go.Figure(data=data, layout=layout)

    radar_plots.append(
        html.Div(className='col-sm-4',
                 children=[
                     dcc.Graph(
                         figure=radar_fig
                     )
                 ]
                 )
    )

# Flask app instance
server = flask.Flask('main_app')
server.secret_key = os.environ.get('secret_key', 'secret')

external_stylesheets = [
    'https://codepen.io/chriddyp/pen/bWLwgP.css',
    {
        'href': 'https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css',
        'rel': 'stylesheet',
        'integrity': 'BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u',
        'crossorigin': 'anonymous'
    }
]

app = dash.Dash(__name__, server=server, external_stylesheets=external_stylesheets)

app.scripts.config.serve_locally = False
dcc._js_dist[0]['external_url'] = 'https://cdn.plot.ly/plotly-basic-latest.min.js'

children_list = [
    html.Div(className='mat-card', style={"display": "block", "margin": "15px"},
             children=[
                 html.H1(children='Segmentation dashboard')
             ]),

    html.Div(className='mat-card', style={"display": "block", "margin": "15px"},
             children=[
                 html.P('Filtering supports equals: eq, greater than: >, and less than: < operations. '
                        'In filter field type e.g.: eq "Adam Vladić" and for numerical columns: eq 34 or > 500')
                 ]),

    html.Div(className='mat-card', style={"display": "block", "margin": "15px"},
             children=[
                 html.H4(children='Clients dataset'),
                 dash_table.DataTable(
                     id='cust-table2',
                     columns=[{'name': i, 'id': i} for i in df_cols],
                     style_cell_conditional=[{'if': {'row_index': 'odd'}, 'backgroundColor': 'rgb(248, 248, 248)'}],
                     style_header={'backgroundColor': 'white', 'fontWeight': 'bold'},
                     pagination_settings={
                         'current_page': 0,
                         'page_size': PAGE_SIZE
                     },
                     pagination_mode='be',
                     filtering='be',
                     filtering_settings='',
                     sorting='be',
                     sorting_type='multi',
                     sorting_settings=[]
                 )
             ]),

    html.Div(className='mat-card', style={"display": "block", "margin": "15px"},
             children=[
                 html.H4(children='Centroids overview'),
                 dash_table.DataTable(
                     id='cust-table',
                     columns=[{'name': i, 'id': i} for i in df_agg_mean_centroids_cols],
                     data=df_agg_mean_centroids.to_dict("records"),
                     style_cell_conditional=[{'if': {'row_index': 'odd'}, 'backgroundColor': 'rgb(248, 248, 248)'}],
                     style_header={'backgroundColor': 'white', 'fontWeight': 'bold'}
                 )
             ]),

    html.Div(className='mat-card', style={"display": "block", "margin": "15px"},
             children=[
                 html.H4(children='Segments size'),
                 dcc.Graph(
                     figure=pie_fig
                 )
             ]),

    html.Div(className='mat-card row', style={"display": "block", "margin": "15px"},
             children=[
                 html.H4(children='Variables overview')
             ]+radar_plots)
]


app.layout = html.Div(children=children_list)


@app.callback(
    Output('cust-table2', 'data'),
    [
     Input('cust-table2', 'pagination_settings'),
     Input('cust-table2', 'sorting_settings'),
     Input('cust-table2', 'filtering_settings')
    ])
def update_table(pagination_settings, sorting_settings, filtering_settings):
    dff = df

    # apply each ' && '-separated filter expression in turn
    filtering_expressions = filtering_settings.split(' && ')
    for filter_expr in filtering_expressions:
        if ' eq ' in filter_expr:
            col_name, filter_value = filter_expr.split(' eq ', 1)
            col_name = col_name.replace('"', '')
            filter_value = filter_value.replace('"', '')
            # compare as strings without converting the shared df in place
            dff = dff.loc[dff[col_name].astype(str) == filter_value]
        if ' > ' in filter_expr:
            col_name, filter_value = filter_expr.split(' > ', 1)
            col_name = col_name.replace('"', '')
            dff = dff.loc[dff[col_name] > float(filter_value)]
        if ' < ' in filter_expr:
            col_name, filter_value = filter_expr.split(' < ', 1)
            col_name = col_name.replace('"', '')
            dff = dff.loc[dff[col_name] < float(filter_value)]

    # sort by every requested column
    if len(sorting_settings):
        dff = dff.sort_values(
            [col['column_id'] for col in sorting_settings],
            ascending=[
                col['direction'] == 'asc'
                for col in sorting_settings
            ],
            inplace=False
        )

    # return only the rows of the requested page
    dff = dff.loc[:, df_cols]
    page = pagination_settings['current_page']
    size = pagination_settings['page_size']
    return dff.iloc[page * size:(page + 1) * size].to_dict('records')


if __name__ == '__main__':
    app.run_server(debug=True)
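
The filtering branch of update_table() can also be read in isolation. Below is a minimal sketch of the same idea as two small pure-Python functions: one that parses a single expression of the form described in the dashboard's help text (eq, > and <), and one that applies all ' && '-separated expressions to a list of rows. The names OPERATORS, parse_filter and apply_filters, as well as the sample rows, are hypothetical and are not part of Dash or of the script above. Unlike the script, the sketch stops at the first operator it recognises in an expression.

```python
import operator

# Operators in the textual form used by the table's filter field
OPERATORS = {
    ' eq ': operator.eq,
    ' > ': operator.gt,
    ' < ': operator.lt,
}


def parse_filter(expression):
    """Split one expression such as 'age > 30' into (column, func, value)."""
    for token, func in OPERATORS.items():
        if token in expression:
            column, raw_value = expression.split(token, 1)
            column = column.strip().replace('"', '')
            raw_value = raw_value.strip().replace('"', '')
            # 'eq' compares as text; '>' and '<' compare as numbers
            value = raw_value if func is operator.eq else float(raw_value)
            return column, func, value
    return None  # empty or unsupported expression


def apply_filters(rows, filtering_settings):
    """Keep only the rows that satisfy every ' && '-separated expression."""
    for expression in filtering_settings.split(' && '):
        parsed = parse_filter(expression)
        if parsed is None:
            continue
        column, func, value = parsed
        if func is operator.eq:
            rows = [r for r in rows if func(str(r[column]), value)]
        else:
            rows = [r for r in rows if func(float(r[column]), value)]
    return rows


# Hypothetical sample data, not taken from the tutorial's dataset
rows = [
    {'name': 'Ana', 'age': 34, 'clv': 620.5},
    {'name': 'Marko', 'age': 52, 'clv': 410.0},
    {'name': 'Iva', 'age': 34, 'clv': 150.0},
]

result = apply_filters(rows, 'age eq 34 && clv > 500')
print(result)
```

Keeping the parsing separate from the DataFrame operations makes it easier to test the filter syntax on its own and to add further operators later.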

 

When you run the whole script you should get the following output:

/Users/jglisovic/.conda/envs/dash_plotly_tutorial/bin/python /Users/jglisovic/Documents/PycharmProjects/
dash_plotly_tutorial/notebooks/dash_plotly_tutorial.py

Running on http://127.0.0.1:8050/

Debugger PIN: 963-424-049
 * Serving Flask app "main_app" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: on

Running on http://127.0.0.1:8050/
Debugger PIN: 817-124-319

 

Open http://127.0.0.1:8050/ in your web browser to view the Dash application.

Below are the snapshots of the Dash app.

 

Besides the dash_app.py script, I also used an assets.css file to style the dashboard. Dash automatically loads CSS files placed in an assets folder next to the app script. Here is the content of that file:

body{
    background-color: #f0f0f0;
}
button.previous-page {
    margin-top: 16px;
}
h1{
    margin: 0;
    text-align: center;
}
h2 {
    margin-top: 0px;
}
th, td {
    border: none;
    box-shadow: none!important;
}
.mat-card{
    transition: box-shadow 280ms cubic-bezier(.4,0,.2,1);
    display: inline-block;
    position: relative;
    padding: 16px;
    border-radius: 4px;
    box-shadow: 0 2px 1px -1px rgba(0,0,0,.2), 0 1px 1px 0 rgba(0,0,0,.14), 0 1px 3px 0 rgba(0,0,0,.12);
    background: #fff;
    color: rgba(0,0,0,.87);
}
.sort {
    margin-top: -4px;
    display: inline-block;
    text-decoration: none!important;
    float: right!important;
    margin-left: 5px;
}
.dash-spreadsheet-container th:hover .sort, .sort:hover{
    color: rgb(31, 119, 180)!important;
}
.dash-fixed-content {
    outline: 1px lightgrey solid;
}

 

Here, we have seen what it takes to make a simple single-page Dash app. Note that an app can be made far more complex, with more pages and more components, both static and interactive. You can present almost all of your important project outputs through Dash apps and share them with business stakeholders so they can explore the results themselves.

I hope this post was helpful and that it sparked your curiosity for developing insightful yet simple dashboards. Such dashboards are useful not only to data scientists but also to business people who do not understand, or do not need to understand, all the data science, coding, maths and statistics you did in order to deliver the business requirement.

Till next time, keep on learning and solving things! 🙂