A machine learning project to predict the likelihood of a heart attack based on patient health data. This repository is centered around an interactive IPython (Jupyter) Notebook, which demonstrates data analysis, model building, and evaluation for heart attack prediction.
Heart disease is among the leading causes of death worldwide, so early prediction and prevention are crucial. This project provides a practical demonstration of how machine learning techniques can be used to analyze patient data and predict the risk of heart attack.
All code and analysis are contained in a single Jupyter Notebook. The notebook covers:
- Data Loading: Importing the heart disease dataset (e.g., UCI Heart Disease Dataset).
- Exploratory Data Analysis (EDA): Visualizations and statistical summaries to understand patterns and correlations.
- Data Preprocessing: Cleaning, handling missing values, encoding categorical variables, and scaling.
- Model Building: Applying machine learning models (such as Logistic Regression, Random Forest, and SVM) to predict heart attack risk.
- Evaluation: Assessing model performance with metrics like accuracy, precision, recall, F1-score, and ROC-AUC.
- Visualization: Displaying results through plots and charts for better understanding.
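The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it uses synthetic data from scikit-learn's `make_classification` as a stand-in for the real dataset, and the feature names, split ratio, and hyperparameters are all assumptions.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the patient data (NOT the real heart disease dataset).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=42
)
# Hypothetical column names, used only for illustration.
features = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(X.shape[1])])

# Hold out 20% of the rows for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    features, y, test_size=0.2, stratify=y, random_state=42
)

# Logistic Regression benefits from scaled inputs; Random Forest does not need them.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    # Probability of the positive class, needed for ROC-AUC.
    probabilities = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "roc_auc": roc_auc_score(y_test, probabilities),
    }
    print(
        f"{name}: accuracy={results[name]['accuracy']:.2f}, "
        f"ROC-AUC={results[name]['roc_auc']:.2f}"
    )
```

Wrapping the scaler and the classifier in one pipeline means the scaler is fitted on the training split only, which avoids leaking test-set statistics into training.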
- Clone the repository:

      git clone https://github.com/ChinmayPande48/heart-attack-prediction.git
      cd heart-attack-prediction

- Install dependencies:
  - It's recommended to use Anaconda or a virtual environment.
  - Install required packages using pip:

        pip install -r requirements.txt

  - Or, manually install the main packages (as used in the notebook):

        pip install numpy pandas matplotlib seaborn scikit-learn jupyter

- Launch Jupyter Notebook:

      jupyter notebook

- Open the notebook file (e.g., `heart_attack_prediction.ipynb`) in your browser.
- Follow the sections in the notebook:
  - Run cells sequentially for data analysis, model training, and evaluation.
  - Modify parameters or try different models as desired.
The notebook summarizes the performance of each machine learning model used and displays relevant charts (such as confusion matrices, ROC curves, or feature importances). Example metrics include:
| Model | Accuracy | Precision | Recall | F1-Score | ROC-AUC |
|---|---|---|---|---|---|
| Logistic Regression | 0.85 | 0.84 | 0.83 | 0.84 | 0.90 |
| Random Forest | 0.88 | 0.87 | 0.85 | 0.86 | 0.92 |
Please refer to the notebook output for detailed results and visualizations.
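For reference, the threshold-based metrics in the table (accuracy, precision, recall, F1-score) can all be derived from confusion-matrix counts. The sketch below shows the arithmetic in plain Python. The function name `classification_metrics` and the counts passed to it are made up for illustration and are not results from the notebook.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives, and true negatives. Ratios with a zero denominator
    are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a 100-patient test set (illustration only).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
for metric_name, value in metrics.items():
    print(f"{metric_name}: {value:.2f}")
```

ROC-AUC is not included here because it depends on the ranking of predicted probabilities rather than on a single set of counts.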
Contributions are welcome! If you have suggestions for improvements, feel free to fork the repository and submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For questions or suggestions, please open an issue!