A comparative analysis of machine learning techniques applied to a dataset of demographic, physiological, and lifestyle factors, with two goals: predicting the likelihood of stroke and identifying which factors contribute most to stroke risk.
This repository contains the project data (including the train–test split), along with the Jupyter notebooks used for exploratory data analysis, model building, hyperparameter tuning, and evaluation. It also includes a folder of final output images generated during the exploration and analysis.
- Python version: 3.11.4
- Notebook environment: Jupyter Notebook / VS Code with Jupyter Extension
- Key packages: pandas, NumPy, scikit-learn, PyTorch, Matplotlib, Seaborn
- Operating system: Cross-platform (Windows, macOS, Linux). The notebooks use standard Python libraries and relative file paths, so they should run on any system with the required packages installed.
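Before opening the notebooks, it can help to confirm that the key packages are installed. The following is a minimal sketch, not part of the repository: the helper name `installed_versions` is hypothetical, and the list assumes the usual PyPI distribution names (PyTorch is published as `torch`).

```python
import sys
from importlib import metadata

# Assumed PyPI distribution names for the key packages listed above.
# Note that PyTorch is installed under the name "torch".
REQUIRED = ["pandas", "numpy", "scikit-learn", "torch", "matplotlib", "seaborn"]


def installed_versions(names=REQUIRED):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` would need to be installed before the corresponding notebook can run.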
Stroke-Risk-Prediction/
├── README.md
├── final_report.pdf
├── Notebooks/
│ ├── preprocessing.ipynb # Data cleaning, feature engineering, train/test split
│ ├── EDA.ipynb # Exploratory data analysis and visualizations
│ ├── Logisticregression.ipynb # Logistic regression model training
│ ├── RandomForest_Train.ipynb # Random Forest model training and hyperparameter tuning
│ ├── SVC_PCA.ipynb # Support Vector Classifier with PCA
│ ├── nn_train.ipynb # Neural network model training
│ └── Final_Model.ipynb # Final model evaluation and comparison
├── Data/
│ ├── healthcare-dataset-stroke-data.csv # Original dataset
│ ├── X_train.csv # Preprocessed training features
│ ├── X_test.csv # Preprocessed test features
│ ├── y_train.csv # Training labels
│ └── y_test.csv # Test labels
└── Outputs/
  ├── age_by_stroke.png
  ├── age_distribution.png
  ├── avg_glucose_distribution.png
  ├── bmi_by_stroke.png
  ├── bmi_distribution_raw.png
  ├── correlation_heatmap.png
  ├── final_model_confusion_matrix.png
  ├── final_model_feature_importance.png
  ├── random_forest_decision_tree.png
  ├── stroke_by_gender.png
  ├── stroke_by_smoking_status.png
  ├── stroke_by_work_type.png
  └── stroke_distribution.png
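Because the notebooks rely on relative file paths, a quick check that the data files sit where the tree above places them can save a failed run. This is a minimal sketch under stated assumptions: the helper `missing_data_files` is hypothetical (it is not part of the repository), and it assumes the check is run from the repository root unless another root is passed in.

```python
from pathlib import Path

# Relative paths copied from the Data/ folder in the tree above.
EXPECTED_DATA_FILES = [
    "Data/healthcare-dataset-stroke-data.csv",
    "Data/X_train.csv",
    "Data/X_test.csv",
    "Data/y_train.csv",
    "Data/y_test.csv",
]


def missing_data_files(repo_root="."):
    """Return the expected data files that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_DATA_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing data files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected data files found.")
```

If the train/test CSVs are reported missing, they can presumably be regenerated by rerunning `preprocessing.ipynb`, which the tree describes as producing the train/test split.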