Authors: Deepa Barethiya, Deepak Vinod Chouksey, Ankur Sanjeev Khurpadi
Abstract: Cardiovascular diseases remain the leading cause of mortality worldwide, accounting for approximately 17.9 million deaths annually according to the World Health Organization. Early detection and accurate risk assessment of heart disease are critical for effective clinical intervention and improved patient outcomes. Traditional diagnostic methods often depend heavily on subjective clinical judgment, which can be inconsistent and time-consuming. This research proposes a machine learning-based predictive system that uses clinical data to assess the risk of heart disease. The system employs multiple classification algorithms, including Logistic Regression, Random Forest, Support Vector Machine (SVM), and XGBoost, and evaluates their performance on the UCI Cleveland Heart Disease dataset. Feature selection techniques such as correlation analysis and Recursive Feature Elimination (RFE) are used to identify the most significant clinical predictors. The proposed ensemble model achieves an accuracy of 91.8%, a sensitivity of 93.2%, and a specificity of 90.4%, outperforming each individual classifier. The results indicate that machine learning can serve as a reliable and scalable decision-support tool for cardiologists and general physicians in early heart disease diagnosis.
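The three reported metrics (accuracy, sensitivity, specificity) all derive from the counts of a binary confusion matrix. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `binary_metrics` and the toy labels are hypothetical and chosen only for illustration; they do not come from the Cleveland dataset. Label 1 is assumed to mean "disease present" and 0 "disease absent".

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, and specificity for 0/1 labels.

    Hypothetical helper for illustration; 1 = disease present, 0 = absent.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Fraction of all cases classified correctly
        "accuracy": (tp + tn) / total if total else 0.0,
        # True positive rate: diseased patients correctly flagged
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # True negative rate: healthy patients correctly cleared
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy labels (illustrative only, not study data)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a screening setting, sensitivity is often the metric to watch most closely, because a false negative (a missed case of heart disease) is usually costlier than a false positive that leads to further testing.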