Insurance Cross-Selling Prediction

This repo contains my graduation project from Purwadhika Startup and Coding School, entitled "Application of Data Science and Machine Learning in Insurance Cross-Selling Prediction".

Background

For an insurance company, cross-selling is one way to sustain the business. Moreover, in a risk-sharing-based investment, a successful cross-sell delivers a multiplied return that can substantially boost the company's growth. However, in the worst case, a failed cross-selling attempt leads the customer to cancel all of their purchases, both the initial product and the offered one. That is why the company needs a reliable, well-planned strategy for estimating how likely a customer is to accept a cross-selling offer.

How Could Data Science and Machine Learning Contribute in This Kind of Situation?

Apply a machine learning algorithm to identify the customers who are more likely to accept the offer.
Predict the likelihood (probability output) that an unseen customer will accept the offer.

The probability output of a classification algorithm can be used by the marketing team to decide how to offer the product to each customer. For instance, if the likelihood is quite low, they can still take a chance and offer the product using a less persuasive approach, without risking the initial purchase the customer has already agreed to. Conversely, if the likelihood is relatively high, they can approach the customer directly through personal contact, e.g. email, WhatsApp, or a phone call.
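As a minimal sketch of that decision rule, the function below maps a predicted probability to an outreach approach. The function name `choose_approach`, the 0.5 cut-off, and the approach labels are illustrative assumptions, not values taken from the project; in practice the marketing team would pick the threshold.

```python
def choose_approach(probability: float, threshold: float = 0.5) -> str:
    """Map a predicted acceptance probability to an outreach approach.

    The default threshold of 0.5 is a placeholder (assumption), not a value
    taken from the project.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= threshold:
        # Relatively high likelihood: contact the customer directly.
        return "direct contact (email, WhatsApp, or phone call)"
    # Low likelihood: softer offer that does not put the initial purchase at risk.
    return "less-persuasive offer"


# Hypothetical probabilities, e.g. the positive-class column returned by a
# scikit-learn classifier's predict_proba.
for customer_id, p in [(101, 0.12), (102, 0.81)]:
    print(customer_id, choose_approach(p))
```

Keeping the threshold as a parameter lets the business side tune how cautious the outreach is without retraining the model.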

Data

The dataset for this project can be downloaded from: https://www.kaggle.com/anmolkumar/health-insurance-cross-sell-prediction

The dataset contains personal data on the customers of a health insurance company. The marketing team would like to know whether these customers are also interested in the company's vehicle insurance product.

Data Description

id : Unique ID for the customer
Gender : Gender of the customer
Age : Age of the customer
Driving_License : 0 : Customer does not have a driving license, 1 : Customer already has a driving license
Region_Code : Unique code for the region of the customer
Previously_Insured : 1 : Customer already has vehicle insurance, 0 : Customer doesn't have vehicle insurance
Vehicle_Age : Age of the vehicle
Vehicle_Damage : 1 : Customer's vehicle has been damaged in the past, 0 : Customer's vehicle has not been damaged in the past
Annual_Premium : The amount the customer needs to pay as premium in the year
Policy_Sales_Channel : Anonymized code for the channel used to reach the customer, i.e. different agents, over mail, over phone, in person, etc.
Vintage : Number of days the customer has been associated with the company
Response : 1 : Customer is interested, 0 : Customer is not interested
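To get a first feel for the target, the following sketch reads rows that use the column names listed above and computes the share of interested customers. The rows are made up for illustration and only a subset of the columns is shown; with the real data you would pass an open file handle for the downloaded CSV to `csv.DictReader` instead (the file name depends on your download).

```python
import csv
import io

# Made-up rows for illustration only (not taken from the dataset);
# a subset of the described columns is used for brevity.
SAMPLE_CSV = """id,Age,Previously_Insured,Annual_Premium,Response
1,35,0,30000.0,1
2,60,0,25000.0,0
3,22,1,28000.0,0
4,41,1,32000.0,0
"""


def response_rate(rows):
    """Return the share of customers with Response == 1."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows given")
    return sum(int(r["Response"]) for r in rows) / len(rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
overall = response_rate(rows)
# csv.DictReader yields strings, so the flag is compared as text.
not_insured = response_rate(r for r in rows if r["Previously_Insured"] == "0")
print(f"overall: {overall:.2f}, not previously insured: {not_insured:.2f}")
```

Comparing the rate across groups such as Previously_Insured is one simple way to spot which columns may carry signal before any modelling.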

Process

The workflow of this project can be separated into five different stages:

Exploratory Data Analysis (EDA) : Finds the hidden patterns in the data and defines the modelling strategy. See Exploratory Data Analysis.ipynb
Modelling : Applies the modelling strategy to find the best combination of pre-processing techniques and ML model. See Modelling.ipynb
Evaluation : Evaluates each combination of pre-processing techniques and ML model using the most suitable metrics. See Modelling.ipynb
Interpretation : Interprets the model in terms of the business problem and shows how each feature relates to the model output. In this stage I use a technique called SHAP to visualize how features contribute to the model output. See Model Benchmarking and Business Interpretation.ipynb
Deployment : Deploys the model to production. See Deployment Preparation.ipynb. To see how the model works in a real-life scenario, start the app by running

python app.py

in your terminal, then open localhost:5000 in a browser on your machine.
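The Evaluation stage above does not say which metric was chosen. Because the project relies on probability output, one plausible candidate (an assumption, not the author's confirmed choice) is the Brier score: the mean squared difference between the predicted probability and the actual 0/1 outcome, where lower is better. The sketch below computes it in plain Python on made-up labels and probabilities for two hypothetical models.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up labels (1 = interested) and predicted probabilities.
y_true = [1, 0, 0, 1]
model_a = [0.9, 0.2, 0.1, 0.7]  # hypothetical model, confident and mostly right
model_b = [0.5, 0.5, 0.5, 0.5]  # hypothetical uninformative constant guess

score_a = brier_score(y_true, model_a)
score_b = brier_score(y_true, model_b)
print(f"model A: {score_a:.4f}, model B: {score_b:.4f}")
```

scikit-learn offers the same metric as `sklearn.metrics.brier_score_loss`, which would be the more usual choice inside a notebook.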

Example of the SHAP plot: see shap.png

Demo

To see how the model works:

Click the Know Your Customer !!! tab.
Input your data.
Click the Predict button.
The result will appear on your screen.

