Skip to content

ibtesaamaslam/Customer-Churn-Prediction-Model

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

10 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

image

πŸš€ Customer Churn Prediction Model

Using Artificial Neural Network (ANN)

Python Pandas NumPy Matplotlib TensorFlow Jupyter Scikit-learn License: MIT

πŸ”₯ Predict customer churn before it happens!
A complete end-to-end Artificial Neural Network (ANN) solution built on the Telco Customer Churn dataset β€” with every code block explained for absolute beginners.


πŸ“– Table of Contents


🎯 Project Overview

Customer churn is when a customer stops using a company's services. It is one of the most expensive problems in subscription businesses β€” telecom, SaaS, banking, and beyond. Retaining an existing customer is 5–25Γ— cheaper than acquiring a new one.

This project builds a powerful Artificial Neural Network (ANN) to predict whether a customer will churn (Yes / No). The model leverages 20+ customer features including tenure, monthly charges, contract type, payment method, and service subscriptions.

Why ANN?
Deep neural networks excel at capturing complex non-linear patterns and feature interactions that traditional machine learning models often miss.

Goal: Deliver high precision + recall so businesses can proactively offer retention campaigns to at-risk customers β€” before they leave.


✨ Key Features

  • βœ… Complete end-to-end pipeline (raw CSV β†’ trained ANN)
  • βœ… Professional data cleaning & preprocessing
  • βœ… Handles missing values in TotalCharges
  • βœ… One-hot encoding of all categorical features
  • βœ… Feature scaling for neural network stability
  • βœ… Deep ANN with hidden layers, ReLU activation & Dropout regularization
  • βœ… Training history visualization (accuracy & loss curves)
  • βœ… Ready for hyperparameter tuning
  • βœ… Every code block explained for absolute beginners

πŸ“Š Dataset

File: Customer-Churn.csv (included in repository)
Size: 7,043 customers Γ— 21 columns
Source: Classic Telco Customer Churn dataset

Key Columns

Column Description Type
customerID Unique ID β€” dropped before training String
gender Male / Female Categorical
SeniorCitizen 0 = No, 1 = Yes Binary
Partner Has a partner? Yes/No
Dependents Has dependents? Yes/No
tenure Months with the company Numerical
PhoneService Has phone service? Yes/No
MultipleLines Multiple lines option Categorical
InternetService DSL / Fiber optic / No Categorical
OnlineSecurity Online security subscription Categorical
OnlineBackup Online backup add-on Categorical
DeviceProtection Device protection plan Categorical
TechSupport Tech support subscription Categorical
StreamingTV Streaming TV add-on Categorical
StreamingMovies Streaming movies add-on Categorical
Contract Month-to-month / One year / Two year Categorical
PaperlessBilling Paperless billing enabled? Yes/No
PaymentMethod Electronic check, Mailed check, etc. Categorical
MonthlyCharges Current monthly bill amount Numerical
TotalCharges Total amount billed β€” contains blanks for new customers Numerical
Churn Target variable: Yes = customer left Binary

⚠️ Important Note: TotalCharges contains blank strings for customers with tenure = 0 (brand-new customers). The notebook converts these to numeric and handles them properly using pd.to_numeric(..., errors='coerce') followed by dropna().


πŸ› οΈ Model Architecture

Sequential ANN
β”œβ”€β”€ Input Layer      β†’  26 neurons  (after one-hot encoding + scaling)
β”œβ”€β”€ Hidden Layer 1   β†’  32 neurons  +  ReLU activation  +  Dropout(0.2)
β”œβ”€β”€ Hidden Layer 2   β†’  16 neurons  +  ReLU activation
└── Output Layer     β†’   1 neuron   +  Sigmoid  (outputs churn probability 0–1)
Component Choice Reason
Loss Function Binary Crossentropy Standard for binary classification
Optimizer Adam Adaptive learning rate; fast convergence
Output Activation Sigmoid Produces probability between 0 and 1
Hidden Activation ReLU Avoids vanishing gradient; trains fast
Regularization Dropout (0.2) Prevents overfitting on training data
Metrics Accuracy, Precision, Recall, F1-score Balanced view of model performance

πŸ“˜ Step-by-Step Code Explanation

Every single block from Churn Prediction Model.ipynb is explained below so even a complete beginner can follow along.


1. Title & Introduction (Markdown cell)

# Customer Churn Prediction Model Using Artificial Neural Network (ANN)

Explanation: Just a heading β€” pure documentation. No code runs here. Good practice to always title your notebooks clearly.


2. Import Libraries (Code cell)

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Explanation:

  • pandas β€” Reads CSV files and manipulates tabular data (think: Excel in Python).
  • numpy β€” Fast math and array operations that pandas relies on under the hood.
  • matplotlib.pyplot β€” Creates professional plots and charts.

Tip: These three libraries are the foundation of almost every data science project in Python.


3. Loading the Dataset (Code cell)

df = pd.read_csv(r"C:\Customer-Churn.csv")

Explanation: Reads the CSV file from disk into a Pandas DataFrame named df. The r prefix before the file path treats backslashes as raw characters β€” essential for Windows paths. After this line, all 7,043 customer records are in memory and ready to work with.


4. Quick Peek at Data (Code cell)

df.sample(5)

Explanation: Displays 5 randomly selected rows so you can eyeball the data β€” see what columns look like, what values exist, and do a sanity check before proceeding.


5. Data Cleaning β€” Drop Useless Column (Code cell)

df.drop('customerID', axis='columns', inplace=True)

Explanation:

  • customerID is unique per row and carries zero predictive signal.
  • axis='columns' tells pandas to drop a column (not a row).
  • inplace=True modifies df directly instead of creating a copy.

After this step, we're left with 20 meaningful features.


6. Confirm Column Removal (Code cell)

df.sample(5)

Explanation: Another quick peek to confirm customerID is gone. Always verify your cleaning steps!


7. Handling TotalCharges β€” Critical Fix! (Code cell)

pd.to_numeric(df.TotalCharges, errors='coerce').isnull()

Explanation:

  • Some rows have blank strings ("") in TotalCharges β€” these belong to customers with tenure = 0 (brand-new, never billed).
  • pd.to_numeric(..., errors='coerce') attempts conversion to a number; anything that fails becomes NaN.
  • .isnull() returns a boolean mask showing which rows have the problem.

Full fix applied in the notebook:

df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce')
df.dropna(subset=['TotalCharges'], inplace=True)

8. Encoding Categorical Features (Code cell)

df = pd.get_dummies(df, drop_first=True)

Explanation: Neural networks require numerical inputs β€” they cannot work with strings like "Male" or "Month-to-month". pd.get_dummies() converts every categorical column into binary 0/1 indicator columns (one-hot encoding). drop_first=True removes one redundant column per category to avoid multicollinearity.


9. Encoding the Target Variable (Code cell)

df['Churn'] = df['Churn'].map({'Yes': 1, 'No': 0})

Explanation: Converts the target column from text (Yes/No) to integers (1/0) so the neural network can compute loss and gradients against it.


10. Splitting Features & Target (Code cell)

X = df.drop('Churn', axis='columns')
y = df['Churn']

Explanation:

  • X = all input features (everything except the target).
  • y = the target column (what we're trying to predict).

11. Train/Test Split (Code cell)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Explanation: Splits data into 80% training and 20% testing. random_state=42 ensures reproducibility β€” you'll get the same split every time you run the notebook.


12. Feature Scaling (Code cell)

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test  = scaler.transform(X_test)

Explanation: Neural networks are sensitive to the scale of inputs. StandardScaler standardizes each feature to have mean = 0 and standard deviation = 1. Key rule: fit only on training data, then transform both train and test β€” this prevents data leakage.


13. Building the ANN (Code cell)

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

Explanation:

  • Sequential = layers stacked one after another.
  • Dense(32, activation='relu') = 32 neurons, each connected to every input, using ReLU activation.
  • Dropout(0.2) = randomly deactivates 20% of neurons during each training step to prevent overfitting.
  • Dense(1, activation='sigmoid') = output layer producing a probability between 0 and 1.

14. Compiling the Model (Code cell)

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

Explanation: compile() configures the learning process. Adam optimizer adapts the learning rate automatically. Binary crossentropy is the standard loss for yes/no classification problems.


15. Training the Model (Code cell)

history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_data=(X_test, y_test)
)

Explanation: Trains the ANN for 50 passes over the training data. batch_size=32 means the model updates weights after every 32 samples. validation_data lets us monitor performance on unseen data at each epoch.


16. Visualizing Training History (Code cell)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8))

ax1.plot(history.history['accuracy'], label='Train')
ax1.plot(history.history['val_accuracy'], label='Validation')
ax1.set_title('Model Accuracy')
ax1.set_ylabel('Accuracy')
ax1.set_xlabel('Epoch')
ax1.legend(loc='upper left')

ax2.plot(history.history['loss'], label='Train')
ax2.plot(history.history['val_loss'], label='Validation')
ax2.set_title('Model Loss')
ax2.set_ylabel('Loss')
ax2.set_xlabel('Epoch')
ax2.legend(loc='upper right')

plt.tight_layout()
plt.show()

Explanation:

  • history stores accuracy and loss recorded after each epoch.
  • We create two subplots stacked vertically:
    • Top: Accuracy over time for both train and validation sets.
    • Bottom: Loss (error) over time for both sets.
  • Watching both curves together helps diagnose overfitting (training accuracy rises while validation drops).
  • plt.tight_layout() prevents labels from overlapping.

17. Evaluating the Model (Code cell)

from sklearn.metrics import classification_report, confusion_matrix

y_pred = (model.predict(X_test) > 0.5).astype(int)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

Explanation:

  • model.predict() outputs probabilities (e.g., 0.73).
  • > 0.5 converts to binary predictions: above 0.5 β†’ churn, below β†’ no churn.
  • confusion_matrix shows true positives, false positives, true negatives, false negatives.
  • classification_report prints precision, recall, and F1-score for both classes.

πŸš€ Installation & How to Run

Step 1 β€” Clone the repository

git clone <https://github.com/ibtesaamaslam/Customer-Churn-Prediction-Model>
cd Customer-Churn-Prediction-ANN

Step 2 β€” Create a virtual environment (recommended)

python -m venv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # macOS / Linux

Step 3 β€” Install dependencies

pip install -r requirements.txt

Step 4 β€” Launch Jupyter Notebook

jupyter notebook

Open Churn Prediction Model.ipynb and run all cells. πŸŽ‰

Note: Customer-Churn.csv is already included in the repository β€” no additional download needed.


πŸ“ˆ Results & Visualizations

Training curves show accuracy steadily increasing and loss decreasing over 50 epochs, indicating stable learning without severe overfitting.

Typical performance on this dataset:

Metric Score
Accuracy ~80–82%
Precision (churn class) ~75%+
Recall (churn class) ~70%+
F1-Score (churn class) ~72%+

Accuracy and loss plots are generated directly inside the notebook as interactive Matplotlib figures.


πŸ”§ Dependencies

requirements.txt

pandas
numpy
matplotlib
tensorflow>=2.10
scikit-learn
jupyter
seaborn

Install all at once with:

pip install -r requirements.txt

πŸ’‘ Business Impact

Why does this model matter in the real world?

  • Cost savings β€” Target only high-risk customers with discounts or retention campaigns instead of blanket offers to everyone.
  • Revenue protection β€” Reducing churn by even 5–10% can translate to significant recurring revenue gains.
  • Actionable insights β€” The model highlights which features most strongly drive churn, such as month-to-month contracts, fiber optic service, and high monthly charges, giving product and CX teams clear levers to pull.
  • Proactive vs reactive β€” Shift from reacting to cancellations to preventing them before the customer decides to leave.

πŸ“Œ Future Improvements

  • Hyperparameter tuning with Keras Tuner or GridSearchCV
  • Model explainability with SHAP or LIME
  • Handle class imbalance using SMOTE or class weights
  • Deploy as a web app using FastAPI + Streamlit
  • Benchmark against XGBoost and LightGBM
  • Add cross-validation for more robust evaluation

🀝 Contributing

Contributions are very welcome!

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature-name)
  3. Commit your changes (git commit -m 'Add: your feature description')
  4. Push to the branch (git push origin feature/your-feature-name)
  5. Open a Pull Request

Please open an issue first to discuss major changes. All contributions β€” bug fixes, documentation, new features β€” are appreciated. ❀️


πŸ“„ License

Distributed under the MIT License. See LICENSE for full details.


Made with ❀️ for the data science community ⭐ If you found this project helpful, please consider giving it a star!

About

πŸ“‰ Predicts customer churn using an ANN on 7,043 Telco records β€” full pipeline from raw CSV to confusion matrix. Built with TensorFlow, Keras & Scikit-learn.

Topics

Resources

License

Stars

Watchers

Forks

Contributors