Predicting customer churn at a wireless telecom company
The database contains 71,047 customers and 75 potential predictors. It combines the 'Calibration' dataset of 40,000 customers and the 'Validation' dataset of 31,047 customers.
The objective of this project is to develop a model for predicting customer churn at a wireless telecom company, and to use insights from the model to develop an incentive plan for enticing would-be churners to remain with the company.
- Exploratory data analysis.
- Data cleaning/preparation with missing value treatment and outlier treatment.
- Variable reduction techniques: (i) Performed a chi-square test to select the significant categorical variables. (ii) Performed stepwise linear regression on the entire dataset, using the previously selected categorical variables together with all the continuous variables, to further reduce the set of variables for the final model.
- Built the final GLM with all the significant variables.
- Applied the VIF function to check for multicollinearity.
- Applied the Concordance function to calculate the concordance rate, discordance rate, Somers' D, and Gamma.
- Predicted the response on the calibration and validation datasets and created the confusion matrix to assess the accuracy and misclassification rate.
- Performed Decile analysis on both datasets and obtained the KS-score.
- From the KS-score, obtained the cutoff probability for classifying churners and non-churners.
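
The concordance and KS steps in the list above can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the function names (`concordance_stats`, `ks_table`, `ks_cutoff`) and the toy labels and probabilities are hypothetical, and the KS table assumes there are at least as many observations as bins.

```python
from typing import Dict, List, Tuple


def concordance_stats(y_true: List[int], y_prob: List[float]) -> Dict[str, float]:
    """Compare every (churner, non-churner) pair of predicted probabilities.

    Brute-force O(n_events * n_non_events); fine for a sketch, slow on large data.
    Assumes both classes are present and at least one pair is not tied.
    """
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    concordant = discordant = 0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1      # churner scored higher: concordant pair
            elif pe < pn:
                discordant += 1      # churner scored lower: discordant pair
    pairs = len(events) * len(non_events)  # tied pairs count only here
    return {
        "concordance": concordant / pairs,
        "discordance": discordant / pairs,
        "somers_d": (concordant - discordant) / pairs,
        "gamma": (concordant - discordant) / (concordant + discordant),
    }


def ks_table(y_true: List[int], y_prob: List[float], n_bins: int = 10) -> List[dict]:
    """Rank by descending probability, split into n_bins groups, track KS per group."""
    ranked = sorted(zip(y_prob, y_true), key=lambda t: t[0], reverse=True)
    n = len(ranked)
    total_events = sum(y for _, y in ranked)
    total_non_events = n - total_events
    rows = []
    cum_events = cum_non_events = 0
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        events = sum(y for _, y in chunk)
        cum_events += events
        cum_non_events += len(chunk) - events
        # KS = cumulative % of churners captured minus cumulative % of non-churners
        ks = cum_events / total_events - cum_non_events / total_non_events
        rows.append({"bin": b + 1, "min_prob": chunk[-1][0], "events": events, "ks": ks})
    return rows


def ks_cutoff(rows: List[dict]) -> Tuple[float, float]:
    """Return (max KS, lowest probability in the bin where KS peaks)."""
    best = max(rows, key=lambda r: r["ks"])
    return best["ks"], best["min_prob"]


# Toy data (hypothetical): 1 = churner, 0 = non-churner
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.1]

stats = concordance_stats(y_true, y_prob)
ks_value, cutoff = ks_cutoff(ks_table(y_true, y_prob, n_bins=5))
print(stats)
print(ks_value, cutoff)
```

With real data, `n_bins=10` gives the usual decile table, and customers whose predicted probability is at or above the returned cutoff would be flagged as likely churners.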
The Excel file (case study 3 - Logistic regression) contains descriptive statistics for all the variables, the final model summary, and the decile analysis, along with all the graphs (gains chart, lift chart, and a comparison of calibration and validation churn rates).