Predictive Healthcare Insights

Hi! I’m Tanmay Sharma, an independent data enthusiast with a passion for data analytics, visualization, and research. I love exploring real-world datasets, building end-to-end workflows using Python, SQL, and Power BI, and sharing actionable insights with the community.

I’m currently focused on financial analysis projects, tech-driven dashboards, and building portfolios that are ready for big firms. When I’m not crunching numbers, you’ll find me learning new data tools, exploring AI applications, or posting about independent research projects.

Introduction
Patient readmissions are a major challenge for healthcare systems: they raise costs and lower the quality of patient care. This project demonstrates an end-to-end workflow that predicts readmissions and generates actionable insights using Python, SQL, Azure, and Power BI.

Problem Statement
Hospitals often struggle to identify which patients are at risk of returning after discharge. By predicting readmissions, hospitals can optimize resources, improve patient outcomes, and reduce operational costs.

Workflow Overview

  1. Python Data Pipeline

    • Generate & clean synthetic healthcare data (hospital_admissions.csv)

    • Feature selection: age, length_of_stay, num_lab_tests, total_cost, prev_admissions

    • Train-test split with stratified sampling to maintain class balance

    • ML Pipeline: SimpleImputer + StandardScaler + RandomForestClassifier

    • Predictions & feature importance saved for dashboard use
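
The pipeline steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the synthetic data generator, its distributions, the assumed link between stay length, prior admissions, and readmission, the `readmitted` target name, the hyperparameters, and the output file names are all assumptions made for the example. Only the feature names and the SimpleImputer + StandardScaler + RandomForestClassifier chain come from the list above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["age", "length_of_stay", "num_lab_tests", "total_cost", "prev_admissions"]


def make_synthetic_admissions(n_rows=500, seed=42):
    """Hypothetical stand-in for hospital_admissions.csv."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(18, 90, n_rows).astype(float),
        "length_of_stay": rng.integers(1, 15, n_rows).astype(float),
        "num_lab_tests": rng.integers(0, 30, n_rows).astype(float),
        "total_cost": rng.normal(8000, 2500, n_rows).round(2),
        "prev_admissions": rng.integers(0, 6, n_rows).astype(float),
    })
    # Assumed relationship: longer stays and prior admissions raise the odds.
    logit = 0.25 * df["length_of_stay"] + 0.6 * df["prev_admissions"] - 3.5
    df["readmitted"] = (rng.random(n_rows) < 1 / (1 + np.exp(-logit))).astype(int)
    # Blank out a few values so the imputer has something to fill.
    df.loc[rng.choice(n_rows, 20, replace=False), "num_lab_tests"] = np.nan
    return df


df = make_synthetic_admissions()
X, y = df[FEATURES], df["readmitted"]

# stratify=y keeps the readmitted / not-readmitted ratio the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=42)),
])
model.fit(X_train, y_train)

# Patient-level probabilities and feature importances for the dashboard.
predictions = X_test.copy()
predictions["readmit_probability"] = model.predict_proba(X_test)[:, 1]
importances = pd.DataFrame({
    "feature": FEATURES,
    "importance": model.named_steps["forest"].feature_importances_,
}).sort_values("importance", ascending=False)


def save_outputs(prefix=""):
    """Write both tables as CSV (file names are placeholders)."""
    predictions.to_csv(f"{prefix}predictions.csv", index=False)
    importances.to_csv(f"{prefix}feature_importance.csv", index=False)
```

Calling `save_outputs()` writes the two CSV files that a dashboard could load. Scaling is not strictly needed for a random forest, but keeping it in the pipeline makes it easy to swap in a different classifier later.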

  2. SQL Analysis (MySQL)

    • Average Length of Stay per Patient

    • Readmission Rate

    • Cost Analysis

    • Gender-wise Readmission Rate

    • Monthly Admissions Trend
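
The first three analyses in this list can be tried out without a MySQL server. The sketch below swaps in Python's built-in `sqlite3` module with an in-memory database and five made-up rows, so the table contents are purely illustrative. The column names follow the `admissions` table used in this post, and "cost analysis" is interpreted here as average and total cost by readmission outcome, which is an assumption.

```python
import sqlite3

# In-memory SQLite stands in for MySQL so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE admissions (
           patient_id INTEGER,
           admit_date TEXT,
           length_of_stay INTEGER,
           total_cost REAL,
           readmitted INTEGER
       )"""
)
# Hypothetical sample rows, not data from the project.
conn.executemany(
    "INSERT INTO admissions VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2024-01-05", 4, 5200.0, 0),
        (1, "2024-02-10", 6, 7800.0, 1),
        (2, "2024-01-20", 2, 3100.0, 0),
        (3, "2024-03-02", 9, 12400.0, 1),
        (3, "2024-03-28", 5, 6900.0, 0),
    ],
)

# Average length of stay per patient.
avg_stay = conn.execute(
    """SELECT patient_id, ROUND(AVG(length_of_stay), 2)
       FROM admissions
       GROUP BY patient_id
       ORDER BY patient_id"""
).fetchall()

# Overall readmission rate as a percentage of all admissions.
readmission_rate = conn.execute(
    "SELECT ROUND(SUM(readmitted) * 100.0 / COUNT(*), 2) FROM admissions"
).fetchone()[0]

# Cost analysis: average and total cost split by readmission outcome.
cost_by_outcome = conn.execute(
    """SELECT readmitted, ROUND(AVG(total_cost), 2), ROUND(SUM(total_cost), 2)
       FROM admissions
       GROUP BY readmitted
       ORDER BY readmitted"""
).fetchall()

print(avg_stay)
print(readmission_rate)
print(cost_by_outcome)
```

These three queries use only standard SQL aggregates, so they should carry over to MySQL unchanged. The monthly trend query further down relies on MySQL's `DATE_FORMAT`, which SQLite does not provide.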

    Patients with High Readmission Risk (Window Function)

     -- risk_rank ranks patients by their number of readmissions;
     -- the final ORDER BY sorts the top 10 by readmission rate instead.
     SELECT patient_id,
            COUNT(*) AS total_admissions,
            SUM(readmitted) AS total_readmissions,
            ROUND(SUM(readmitted) * 100.0 / COUNT(*), 2) AS readmission_rate,
            RANK() OVER (ORDER BY SUM(readmitted) DESC) AS risk_rank
     FROM admissions
     GROUP BY patient_id
     ORDER BY readmission_rate DESC
     LIMIT 10;
    

    Monthly Admissions Trend

     SELECT DATE_FORMAT(admit_date, '%Y-%m') AS month,
            COUNT(*) AS admissions,
            ROUND(SUM(total_cost), 2) AS total_cost
     FROM admissions
     GROUP BY month
     ORDER BY month;
    

    Gender-wise Readmission Rate

     -- COUNT(*) counts admission rows after the join, not distinct patients
     SELECT p.gender,
            COUNT(*) AS total_admissions,
            SUM(a.readmitted) AS total_readmitted,
            ROUND(SUM(a.readmitted) * 100.0 / COUNT(*), 2) AS readmission_rate
     FROM admissions a
     JOIN patients p ON a.patient_id = p.patient_id
     GROUP BY p.gender
     ORDER BY readmission_rate DESC;
    

    Age Group vs Average Length of Stay

     SELECT CASE 
                WHEN p.age BETWEEN 0 AND 18 THEN '0-18'
                WHEN p.age BETWEEN 19 AND 35 THEN '19-35'
                WHEN p.age BETWEEN 36 AND 50 THEN '36-50'
                WHEN p.age BETWEEN 51 AND 65 THEN '51-65'
                ELSE '66+'
            END AS age_group,
            ROUND(AVG(a.length_of_stay), 2) AS avg_stay
     FROM admissions a
     JOIN patients p ON a.patient_id = p.patient_id
     GROUP BY age_group
     ORDER BY avg_stay DESC;
    

  3. Azure Blob Storage

    • CSV datasets stored securely in the cloud

    • Enables integration with Power BI for real-time dashboards

  4. Power BI Dashboard

    • KPI Cards: Total Patients, Readmissions, Avg Length of Stay

    • Pie/Donut Charts: Readmission vs No Readmission, Gender Distribution

    • Bar/Line Charts: Readmission trends, Top 5 Costly Diagnoses

    • Patient-level predictions table with probabilities

Key Insights

  • Patients with longer stays or previous admissions are more likely to be readmitted

  • Readmission patterns by gender and cost correlate with length of stay

  • Interactive dashboards allow hospital management to allocate resources proactively

Conclusion
This project demonstrates healthcare analytics end to end, combining Python, SQL, Azure, and Power BI. It shows how an analytics pipeline can deliver predictive insights and support data-driven decision-making in healthcare.

Website: https://tanmayportfolio52.wordpress.com/

GitHub: https://github.com/Tanu272004