ML / Healthcare | ✓ Completed | September 2022

Heart Disease Detection

A heart disease prediction system built with classical machine learning methods. The pipeline first preprocesses the data (missing-value handling, feature scaling, categorical encoding), then trains and compares three supervised classifiers: Logistic Regression, Random Forest, and K-Nearest Neighbors. The system achieved 85% accuracy on the Cleveland Heart Disease dataset, and the project includes a full model comparison and a feature importance analysis.
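The preprocessing steps described above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the data is synthetic, the column names (age, chol, thalach, cp, thal) are borrowed from the Cleveland dataset for illustration only, and the choice of median and most-frequent imputation is an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (assumption): NOT the real Cleveland records.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(29, 78, n).astype(float),
    "chol": rng.normal(245, 50, n),
    "thalach": rng.normal(150, 22, n),
    "cp": rng.integers(1, 5, n),
    "thal": rng.choice([3.0, 6.0, 7.0], n),
})
# Inject some missing values so the imputers have work to do.
df.loc[rng.choice(n, 10, replace=False), "chol"] = np.nan
df.loc[rng.choice(n, 5, replace=False), "thal"] = np.nan
# Synthetic binary target loosely tied to age.
y = (df["age"] + rng.normal(0, 10, n) > 55).astype(int)

numeric = ["age", "chol", "thalach"]
categorical = ["cp", "thal"]

preprocess = ColumnTransformer([
    # Numeric columns: fill gaps with the median, then standardize.
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric),
    # Categorical columns: fill gaps with the mode, then one-hot encode.
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical),
])

# Preprocessing and classifier live in one pipeline, so both are fit together.
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(df, y)
preds = model.predict(df)
```

Keeping the imputers, scaler, and encoder inside a single Pipeline means they are re-fit on the training portion of each split when the model is cross-validated, which avoids leaking statistics from held-out rows.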

Tech Stack

Python, Scikit-learn, Pandas, Matplotlib, Seaborn, Logistic Regression, Random Forest, KNN

Key Highlights

  • 85% classification accuracy on Cleveland Heart Disease dataset
  • Comparative analysis: Logistic Regression vs Random Forest vs KNN
  • Feature importance analysis and SHAP explanations
  • Full preprocessing pipeline with cross-validation
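
The model comparison and feature importance analysis listed above might look something like the sketch below. It makes several assumptions: the data is generated with make_classification rather than loaded from the Cleveland dataset, the hyperparameters (5 folds, 200 trees, 7 neighbors) are illustrative, and feature importance uses Random Forest's built-in impurity-based scores. SHAP explanations are not shown here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 13 features, binary target.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Scale-sensitive models get a scaler inside their pipeline,
# so scaling is fit on the training folds only.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7)),
}

# Same stratified folds for every model, so the scores are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}
for name, fold_scores in scores.items():
    print(f"{name}: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")

# Impurity-based importances from a forest fit on all of the data.
forest = models["Random Forest"].fit(X, y)
importances = forest.feature_importances_
ranking = importances.argsort()[::-1]  # feature indices, most important first
for idx in ranking[:5]:
    print(f"feature {idx}: {importances[idx]:.3f}")
```

Reporting the mean and spread across folds, rather than a single train/test split, gives a steadier basis for comparing the three classifiers on a dataset of only a few hundred rows.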