Medical Staff Allocation for Influenza Season

Project Overview
For this project, I used two datasets, covering population and influenza deaths by geography, age, time, and gender, to determine when the influenza season occurs in the US and to assess how vulnerable the population is to the virus.
The project is intended to help the management team of a medical staffing agency distribute additional medical staff efficiently across the US.
Limitations
The datasets contained no personally identifiable information (PII), so I did not face any data privacy issues.
Data about children under 5 was suppressed for protection purposes and was not accessible within the datasets.
Objective
To gain practical proficiency in hypothesis formulation and testing as well as data cleaning, integration, and analysis using different Excel functions.
Tools and Techniques
Data cleaning and analysis were done in Excel.
Data visualization was done in Tableau.
Data Preparation
I evaluated five available, potentially relevant datasets. Only the influenza deaths data and the population data were well suited to the analysis; the other three were incomplete and therefore impractical to use. After identifying the data grain, I combined the two datasets using the VLOOKUP function in Excel.
I addressed inconsistencies and missing values in Excel and documented every change made during data cleaning.
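The VLOOKUP-style join can be sketched outside Excel as a dictionary lookup keyed on the data grain. This is a minimal illustration only: the grain used here (state and year), the column names, and every value are invented placeholders, since the write-up does not list them.

```python
# Hypothetical rows: column names and numbers are placeholders, not project data.
deaths_rows = [
    {"state": "State A", "year": 2017, "deaths": 120},
    {"state": "State B", "year": 2017, "deaths": 15},
    {"state": "State C", "year": 2017, "deaths": 3},
]
population_rows = [
    {"state": "State A", "year": 2017, "population_65_plus": 800000},
    {"state": "State B", "year": 2017, "population_65_plus": 80000},
]


def join_on_grain(left_rows, right_rows, key_fields, value_field):
    """Attach value_field from right_rows to each left row, matched on key_fields."""
    lookup = {tuple(r[k] for k in key_fields): r[value_field] for r in right_rows}
    joined = []
    for row in left_rows:
        key = tuple(row[k] for k in key_fields)
        # Like VLOOKUP returning #N/A, a missing key yields None instead of a guess.
        joined.append({**row, value_field: lookup.get(key)})
    return joined


combined = join_on_grain(
    deaths_rows, population_rows, ("state", "year"), "population_65_plus"
)
# Rows with no match are worth reviewing before any analysis.
unmatched = [r for r in combined if r["population_65_plus"] is None]
```

Listing the unmatched rows separately mirrors the habit of checking for #N/A results after a VLOOKUP, so gaps are recorded rather than silently dropped.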
Exploratory Analysis
To determine the influenza season, I used influenza-related death counts as an indicator of the cyclical nature of influenza outbreaks. A time-series chart revealed a seasonal pattern in the number of influenza deaths, which marks the influenza season.
Adults over 65 were listed among the vulnerable populations. I developed and tested null and alternative hypotheses to verify the relationship between the size of the vulnerable population and the total number of influenza-related deaths:
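One way to surface the seasonal pattern before charting it is to total deaths per calendar month across years. The sketch below assumes records of the form (year, month, deaths); that layout and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical (year, month, deaths) records: placeholder values only.
records = [
    (2016, 1, 50),
    (2016, 7, 5),
    (2017, 1, 60),
    (2017, 7, 8),
    (2017, 12, 40),
]


def deaths_by_month(rows):
    """Sum deaths per calendar month across all years."""
    totals = defaultdict(int)
    for _year, month, deaths in rows:
        totals[month] += deaths
    return dict(sorted(totals.items()))


monthly = deaths_by_month(records)
# The month with the highest total is a candidate peak of the season.
peak_month = max(monthly, key=monthly.get)
```

Plotting `monthly` as a line or bar chart would give the same view as the temporal chart described above, with one point per month.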

Null hypothesis: The death count in states with a larger population over 65 is less than or equal to the death count in states with a larger non-vulnerable population.
Alternative hypothesis: The death count is higher in states with a larger population over 65.
Results: The null hypothesis was rejected at the 95% confidence level, with a p-value of 9.16E-45.
The results of the statistical testing are further confirmed by the accompanying scatterplot.
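The write-up does not name the test that was run. As an assumption, a one-tailed two-sample comparison of this kind could be done with Welch's unequal-variance t-test; the sketch below computes only the t statistic and the degrees of freedom, and the two samples are invented placeholders.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's t-test of mean(sample_a) - mean(sample_b)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical death counts: states with larger vs. smaller 65+ populations.
high_65 = [120, 135, 150, 160, 142]
low_65 = [40, 55, 38, 60, 47]

t_stat, df = welch_t(high_65, low_65)
```

A large positive `t_stat` supports the alternative hypothesis. The one-sided p-value is the right-tail probability of a t distribution with `df` degrees of freedom, which can be read from Excel's T.DIST.RT, for example.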
Recommendations and Next Steps
Further analysis of other relevant data, such as data about other vulnerable populations (if available), could strengthen these insights and conclusions.
A priority list of states with larger vulnerable populations can be created based on the results obtained from this analysis to efficiently distribute medical staff across states.
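A priority list like this could be as simple as ranking states by the size of their over-65 population. The following sketch assumes a mapping from state to 65+ population; the state names and counts are placeholders, and a real list might weight other factors as well.

```python
# Hypothetical 65+ population per state: placeholder names and values.
population_65_plus = {
    "State A": 900000,
    "State B": 250000,
    "State C": 1400000,
}


def priority_list(population_by_state, top_n=None):
    """Return state names ordered from largest to smallest population."""
    ranked = sorted(
        population_by_state.items(), key=lambda item: item[1], reverse=True
    )
    # top_n=None keeps every state; otherwise only the first top_n are returned.
    return [state for state, _count in ranked[:top_n]]


all_states = priority_list(population_65_plus)
top_two = priority_list(population_65_plus, top_n=2)
```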
Deliverables
Tableau Dashboard
