Predictive Healthcare Insights
Hi! I’m Tanmay Sharma, an independent data enthusiast with a passion for data analytics, visualization, and research. I love exploring real-world datasets, building end-to-end workflows using Python, SQL, and Power BI, and sharing actionable insights with the community.
I’m currently focused on financial analysis projects, tech-driven dashboards, and building portfolios ready for big firms. When I’m not crunching numbers, you’ll find me learning new data tools, exploring AI applications, or posting about independent research projects.
Introduction
Healthcare systems face a major challenge with patient readmissions, leading to increased costs and reduced patient care quality. This project demonstrates an end-to-end workflow to predict readmissions and generate actionable insights using Python, SQL, Azure, and Power BI.
Problem Statement
Hospitals often struggle to identify which patients are at risk of returning after discharge. By predicting readmissions, hospitals can optimize resources, improve patient outcomes, and reduce operational costs.
Workflow Overview
Python Data Pipeline
Generate & clean synthetic healthcare data (hospital_admissions.csv)
Feature selection: age, length_of_stay, num_lab_tests, total_cost, prev_admissions
Train-test split with stratified sampling to maintain class balance
ML Pipeline: SimpleImputer + StandardScaler + RandomForestClassifier
Predictions & feature importance saved for dashboard use
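The pipeline steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the synthetic distributions, the risk formula used to create labels, the hyperparameters, the target column name readmitted (borrowed from the SQL queries), and the output file names are all assumptions made for the example.

```python
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["age", "length_of_stay", "num_lab_tests", "total_cost", "prev_admissions"]
TARGET = "readmitted"  # assumed name, taken from the SQL queries


def make_synthetic_admissions(n_rows=500, seed=42):
    """Generate illustrative synthetic admissions (all distributions are made up)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(18, 90, n_rows),
        "length_of_stay": rng.integers(1, 15, n_rows),
        "num_lab_tests": rng.integers(0, 30, n_rows).astype("float64"),
        "total_cost": rng.normal(8000, 2500, n_rows).round(2),
        "prev_admissions": rng.integers(0, 6, n_rows),
    })
    # Assumed relationship: longer stays and more prior admissions raise risk.
    logit = 0.25 * df["length_of_stay"] + 0.6 * df["prev_admissions"] - 3.5
    prob = 1 / (1 + np.exp(-logit))
    df[TARGET] = (rng.random(n_rows) < prob).astype(int)
    # Blank out about 5% of lab-test counts so the imputer has work to do.
    df.loc[rng.random(n_rows) < 0.05, "num_lab_tests"] = np.nan
    return df


def train_readmission_model(df, seed=42):
    """Stratified split, then impute -> scale -> random forest."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET],
        test_size=0.2, stratify=df[TARGET], random_state=seed,
    )
    model = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=seed)),
    ])
    model.fit(X_train, y_train)

    # Patient-level predictions with the probability of the positive class.
    predictions = X_test.copy()
    predictions["actual"] = y_test
    predictions["readmit_probability"] = model.predict_proba(X_test)[:, 1]

    importance = pd.DataFrame({
        "feature": FEATURES,
        "importance": model.named_steps["forest"].feature_importances_,
    }).sort_values("importance", ascending=False)
    return model, predictions, importance


def save_for_dashboard(predictions, importance, out_dir="."):
    """Write two CSVs a dashboard could load (file names are placeholders)."""
    out = Path(out_dir)
    predictions.to_csv(out / "predictions.csv", index=False)
    importance.to_csv(out / "feature_importance.csv", index=False)
```

Calling make_synthetic_admissions(), passing the result to train_readmission_model(), and then calling save_for_dashboard() produces the prediction and feature-importance files. Keeping the imputer and scaler inside the Pipeline means they are fitted on the training split only, which avoids leaking test data into preprocessing.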

SQL Analysis (MySQL)
Average Length of Stay per Patient
Readmission Rate
Cost Analysis
Gender-wise Readmission Rate
Monthly Admissions Trend
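The queries behind the first three items (average length of stay per patient, readmission rate, cost analysis) are not shown in this post. As a rough sketch of what they might look like, the snippet below runs plausible versions against an in-memory SQLite table from Python. The query text, the four sample rows, and the function name are assumptions for illustration only; the column names follow the admissions table used in the MySQL queries in this section.

```python
import sqlite3

# Made-up rows purely for illustration:
# (patient_id, length_of_stay, total_cost, readmitted)
SAMPLE_ROWS = [
    (1, 3, 4200.0, 0),
    (1, 7, 9100.0, 1),
    (2, 2, 3100.0, 0),
    (3, 10, 15000.0, 1),
]


def run_summary_queries(rows):
    """Load rows into an in-memory table and run three summary queries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE admissions ("
        "patient_id INTEGER, length_of_stay INTEGER, "
        "total_cost REAL, readmitted INTEGER)"
    )
    conn.executemany("INSERT INTO admissions VALUES (?, ?, ?, ?)", rows)

    # Average length of stay per patient.
    avg_stay = conn.execute(
        "SELECT patient_id, ROUND(AVG(length_of_stay), 2) "
        "FROM admissions GROUP BY patient_id ORDER BY patient_id"
    ).fetchall()

    # Overall readmission rate as a percentage of admissions.
    readmission_rate = conn.execute(
        "SELECT ROUND(SUM(readmitted) * 100.0 / COUNT(*), 2) FROM admissions"
    ).fetchone()[0]

    # Cost analysis: total and average cost per admission.
    total_cost, avg_cost = conn.execute(
        "SELECT ROUND(SUM(total_cost), 2), ROUND(AVG(total_cost), 2) "
        "FROM admissions"
    ).fetchone()

    conn.close()
    return avg_stay, readmission_rate, (total_cost, avg_cost)


if __name__ == "__main__":
    print(run_summary_queries(SAMPLE_ROWS))
```

These three statements use only standard aggregate functions, so they should carry over to MySQL unchanged; MySQL-specific functions such as DATE_FORMAT in the monthly trend query would not run in SQLite.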
Patients with High Readmission Risk (Window Function)
SELECT patient_id,
       COUNT(*) AS total_admissions,
       SUM(readmitted) AS total_readmissions,
       ROUND(SUM(readmitted) * 100.0 / COUNT(*), 2) AS readmission_rate,
       RANK() OVER (ORDER BY SUM(readmitted) DESC) AS risk_rank
FROM admissions
GROUP BY patient_id
ORDER BY readmission_rate DESC
LIMIT 10;
Monthly Admissions Trend
SELECT DATE_FORMAT(admit_date, '%Y-%m') AS month,
       COUNT(*) AS admissions,
       ROUND(SUM(total_cost), 2) AS total_cost
FROM admissions
GROUP BY month
ORDER BY month;
Gender-wise Readmission Rate
SELECT p.gender,
       COUNT(*) AS total_patients,
       SUM(a.readmitted) AS total_readmitted,
       ROUND(SUM(a.readmitted) * 100.0 / COUNT(*), 2) AS readmission_rate
FROM admissions a
JOIN patients p ON a.patient_id = p.patient_id
GROUP BY p.gender
ORDER BY readmission_rate DESC;
Age Group vs Average Length of Stay
SELECT CASE
         WHEN p.age BETWEEN 0 AND 18 THEN '0-18'
         WHEN p.age BETWEEN 19 AND 35 THEN '19-35'
         WHEN p.age BETWEEN 36 AND 50 THEN '36-50'
         WHEN p.age BETWEEN 51 AND 65 THEN '51-65'
         ELSE '66+'
       END AS age_group,
       ROUND(AVG(a.length_of_stay), 2) AS avg_stay
FROM admissions a
JOIN patients p ON a.patient_id = p.patient_id
GROUP BY age_group
ORDER BY avg_stay DESC;
Azure Blob Storage
CSV datasets stored securely in the cloud
Enables integration with Power BI for real-time dashboards
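As one possible way to get the CSVs into Blob Storage, here is a minimal sketch using the azure-storage-blob package. The post does not say how the upload was actually done, so the function name, the container name in the usage note, and the use of a connection string for authentication are assumptions for illustration.

```python
from pathlib import Path


def upload_csv(local_path, container, connection_string, blob_name=None):
    """Upload one CSV file to an Azure Blob Storage container.

    Returns the URL of the uploaded blob. The blob name defaults to the
    local file name.
    """
    # Imported here so the module still loads where the SDK is not installed.
    from azure.storage.blob import BlobServiceClient

    path = Path(local_path)
    service = BlobServiceClient.from_connection_string(connection_string)
    blob = service.get_blob_client(container=container, blob=blob_name or path.name)
    with path.open("rb") as fh:
        # overwrite=True replaces an existing blob of the same name.
        blob.upload_blob(fh, overwrite=True)
    return blob.url
```

A call might look like upload_csv("hospital_admissions.csv", "healthcare-data", conn_str), where "healthcare-data" is a placeholder container name and conn_str is read from an environment variable rather than hard-coded. Power BI can then read the file through its Azure Blob Storage connector.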

Power BI Dashboard
KPI Cards: Total Patients, Readmissions, Avg Length of Stay
Pie/Donut Charts: Readmission vs No Readmission, Gender Distribution
Bar/Line Charts: Readmission trends, Top 5 Costly Diagnoses
Patient-level predictions table with probabilities
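Before trusting the KPI cards, it can help to recompute the same three figures outside Power BI and compare. The sketch below does that with the standard library only. It assumes the CSV has patient_id, readmitted, and length_of_stay columns (names taken from the SQL queries); the function name and the four sample rows are made up for the example.

```python
import csv
import io
from statistics import mean

# Made-up sample in the assumed CSV layout.
SAMPLE_CSV = """patient_id,length_of_stay,readmitted
1,3,0
1,7,1
2,2,0
3,10,1
"""


def dashboard_kpis(csv_text):
    """Compute the three KPI-card values from admissions CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        # Distinct patients, not admission rows.
        "total_patients": len({r["patient_id"] for r in rows}),
        # Number of admissions flagged as readmitted.
        "readmissions": sum(int(r["readmitted"]) for r in rows),
        # Mean stay across all admissions, in the unit used by the CSV.
        "avg_length_of_stay": round(mean(float(r["length_of_stay"]) for r in rows), 2),
    }


if __name__ == "__main__":
    print(dashboard_kpis(SAMPLE_CSV))
```

If the card values differ from these numbers, a common cause is counting admission rows where distinct patients were intended, or the reverse.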

Key Insights
Patients with longer stays or previous admissions are more likely to be readmitted
Readmission patterns by gender and cost correlate with length of stay
Interactive dashboards allow hospital management to proactively allocate resources
Conclusion
This project demonstrates an end-to-end healthcare analytics workflow that combines Python, SQL, Azure, and Power BI. It shows how such a pipeline can deliver predictive insights and support data-driven decision-making in healthcare.
Website Link: https://tanmayportfolio52.wordpress.com/
GitHub Link: https://github.com/Tanu272004