Databricks Incident Analysis
Refer to https://git.kalar.codes/NickKalar/Incident_Report_Generator for more context.
After using pure Python and Pandas to build an Excel spreadsheet that computed analytics for a CSV
file of support incidents, I decided the application I built was too cumbersome and not easily
extensible. It required a lot of fine-tuning and trial and error just to add a new table or graph.
With Databricks, most if not all of the same analytics can be produced in a simple and extensible
way, so that even the least technically savvy user can add or manipulate data as needed.
How to use
In Databricks, create a catalog named analytics, inside it a schema named incidents, and inside
that a volume named current. Upload the CSV file to that volume, or create your own CSV with the
same headers as the file in the repository.
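If you prefer to script that setup instead of clicking through the UI, a minimal sketch of the equivalent Unity Catalog SQL, run from a notebook cell (the spark session is provided by Databricks), could look like this:

```python
# Sketch only: creates the catalog/schema/volume hierarchy described above.
# Assumes a Databricks workspace with Unity Catalog enabled and sufficient privileges.
spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.incidents")
spark.sql("CREATE VOLUME IF NOT EXISTS analytics.incidents.current")
```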
Once that is done, import the notebook and run the first cell to get the majority of the analytical
work done. If the source system can be scheduled to dump a new CSV of incidents into S3 or a
similar storage system, the notebook could be scheduled to run on the same cadence (offset by a few
minutes or more) and alert stakeholders on success or the developer/admin on failure.
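For a rough idea of the kind of load that first cell performs (this is a sketch, not the notebook's actual code; the file name incidents.csv is assumed, so use whatever name your upload has):

```python
# Sketch only: read the uploaded CSV from the Unity Catalog volume created above.
df = (
    spark.read
         .option("header", True)       # CSV has a header row
         .option("inferSchema", True)  # let Spark guess column types
         .csv("/Volumes/analytics/incidents/current/incidents.csv")
)

# Expose the data to SQL cells so a new table or graph is a one-line query away.
df.createOrReplaceTempView("incidents")

display(df)  # Databricks' built-in tabular/chart renderer
```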