I do, Dash

Data Analysis

·

2 min read

I was tasked to create a tool to analyse and visualise results, collected from multiple organisations. The data included hundreds of questions and respective answers stored in dozens of spreadsheets. My objectives were to create an interactive dashboard.

Data ingestion

As mentioned before, raw data were stored in spreadsheet forms. read_excel function of pandas data analysis package was utilised for ingesting the surveys. There are a few useful things I learnt when using read_excel.

  • sheet_name:[List(int)] ~ request multiple sheets from a spreadsheet. Like anything order in Python, the sheet_name is 0-based index a.k.a. the first sheet is named 0.

  • usecols:[String] ~ a string pattern for specifying considered columns e.g. usecols='A:D' or usecols='A,B,C:D' select columns A,B,C and D.

  • skiprows:[List(int)] ~ skip header rows

  • dtype:[Object] ~ specify format of ingested data. This param needs specifying to avoid any ambiguity of input data types. When this param left unspecified, pandas will infer data types of columns from their respective values.

Noted, there were multiple warnings UserWarning: Data Validation extension is not supported and will be removed, automatically generated when a read_excel was called.

I have not profiled the application properly. I suspected I/O processes of reading these spreadsheets might be a bottleneck if the dashboard needs scaling out i.e. handling more than dozens of surveys.

Data Visualisation

Summary

It is often desirable to have a summary for each column on a barchart. Dash / Plotly provides add_annotation for Figure object to help dashboard developer add informative annotations. Bellow

fig.add_annotation(
  dict(
    font=dict(color='green',size=14), # font format
    showarrow=False, # disable arrow
    bgcolor='rgba(240,240,240,255)', # set background color for the annotation
    x=k, # numerical or categorical value
    y=1.05, # y-coordinate w.r.t. to yref [0~bottom, 1~top]
    text='your annotation', # add text annotation
    yref="paper", # using the paper frame
    xref="x", # using the current x axis
  )
)