This project implements core Machine Learning algorithms entirely from scratch using only Python and NumPy, without using scikit-learn for model training.
It is built to demonstrate:
- Strong understanding of ML fundamentals
- Research-level implementation skills
- Internship and placement readiness
- A high-quality GitHub portfolio project
| Algorithm | Type | Implemented From Scratch |
|---|---|---|
| Linear Regression | Regression | Yes |
| Logistic Regression | Classification | Yes |
| K-Nearest Neighbors (KNN) | Classification & Regression | Yes |
| Decision Tree (CART) | Classification | Yes |
| Random Forest | Ensemble Learning | Yes |
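
As an illustration of what "from scratch" means for the first row of the table, here is a minimal sketch of linear regression trained with batch gradient descent using only NumPy. The class name `LinearRegressionGD`, its parameters (`lr`, `n_iters`), and the toy data are assumptions made for this example; the code in `models/linear_regression.py` may be organized differently.

```python
import numpy as np


class LinearRegressionGD:
    """Hypothetical sketch: linear regression fit by batch gradient descent."""

    def __init__(self, lr=0.1, n_iters=2000):
        self.lr = lr              # learning rate (assumed default)
        self.n_iters = n_iters    # number of full-batch updates (assumed default)
        self.weights = None
        self.bias = 0.0

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0.0
        for _ in range(self.n_iters):
            # Residuals of the current predictions
            error = X @ self.weights + self.bias - y
            # Gradients of the mean squared error w.r.t. weights and bias
            grad_w = (2.0 / n_samples) * (X.T @ error)
            grad_b = 2.0 * error.mean()
            self.weights -= self.lr * grad_w
            self.bias -= self.lr * grad_b
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.weights + self.bias


# Toy data generated from y = 3x + 2 (illustrative only, not taken from data/)
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 3.0 * X[:, 0] + 2.0
model = LinearRegressionGD().fit(X, y)
print(model.weights, model.bias)
```

On this noise-free toy data the learned slope and intercept should approach 3 and 2. Real datasets usually need feature scaling and a tuned learning rate for gradient descent to converge well.
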
- Custom Train-Test Split implementation
- Custom Evaluation Metrics
- Full Machine Learning Pipeline System
- Custom Standard Scaler
- No dependency on `scikit-learn` for training
- Data visualization using Matplotlib
- Clean, modular, and scalable project structure
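
The custom utilities listed above can be sketched roughly as follows. This is an assumption-laden illustration rather than the repository's code: the function and class names (`train_test_split`, `StandardScaler`, `accuracy`, `mean_squared_error`), their signatures, and the `seed` parameter are hypothetical and chosen only to mirror the file names under `utils/`.

```python
import numpy as np


def train_test_split(X, y, test_size=0.2, seed=0):
    """Shuffle row indices and hold out the first `test_size` fraction as test data."""
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


class StandardScaler:
    """Hypothetical scaler: centers each column and scales it to unit variance."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        std = X.std(axis=0)
        # Guard against division by zero for constant columns
        self.scale_ = np.where(std == 0.0, 1.0, std)
        return self

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) / self.scale_


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))


# Toy data (illustrative only): 10 samples, 2 features
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, seed=42)

# Fit the scaler on training data only, then reuse its statistics for test data
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split alone keeps test-set statistics from leaking into training, which is the usual reason to chain these steps in a pipeline.
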
ml-from-scratch/
│
├── data/
│   ├── linear_data.csv
│   ├── logistic_data.csv
│   └── tree_data.csv
│
├── models/
│   ├── linear_regression.py
│   ├── logistic_regression.py
│   ├── knn.py
│   ├── decision_tree.py
│   └── random_forest.py
│
├── utils/
│   ├── train_test_split.py
│   ├── metrics.py
│   ├── scaler.py
│   └── pipeline.py
│
├── main.py
├── run_knn.py
├── run_tree.py
├── run_random_forest.py
├── run_pipeline.py
└── README.md
Activate the virtual environment (Windows):
venv\Scripts\activate
Run different models:
python main.py
python run_knn.py
python run_tree.py
python run_random_forest.py
python run_pipeline.py
This repository demonstrates core machine learning engineering skills, with a focus on mathematical correctness, algorithmic clarity, and clean system design rather than on calling pre-built libraries.