Customer Churn Prediction with ANN (Classification)

This project developed a binary classification model to predict bank customer churn using an Artificial Neural Network (ANN). I built a deep learning solution to identify customers likely to leave the bank based on their demographic and account information, enabling proactive retention strategies.

💻 Tech Stack:

Python for model development and evaluation

Scikit-learn for preprocessing, encoding, scaling, and evaluation metrics

Pandas for dataset loading and initial data exploration

NumPy for numerical operations and array manipulation

TensorFlow/Keras for building and training the neural network

🧪 Data Pipeline:

Load & inspect data: Loaded the customer banking dataset with demographic and account features, selected the relevant features (columns [:, 3:-1], which skips the first three columns) as input variables, and extracted churn status (the last column) as the binary target variable using iloc[].
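The slicing step can be illustrated on a tiny stand-in table. This is only a sketch: the three rows are made up, and the column names are assumed to follow the usual Churn_Modelling.csv layout (three identifier columns first, the churn label last), trimmed to a few features for brevity.

```python
import pandas as pd

# Hypothetical three-row stand-in for Churn_Modelling.csv (values are made up).
toy = pd.DataFrame({
    "RowNumber": [1, 2, 3],
    "CustomerId": [101, 102, 103],
    "Surname": ["A", "B", "C"],
    "CreditScore": [619, 608, 502],
    "Geography": ["France", "Spain", "France"],
    "Gender": ["Female", "Female", "Male"],
    "Age": [42, 41, 42],
    "Exited": [1, 0, 1],
})

# Positions 0-2 are skipped, so features start at position 3;
# the last column is the churn label.
X = toy.iloc[:, 3:-1].values
y = toy.iloc[:, -1].values

feature_names = list(toy.columns[3:-1])
print(feature_names)
print(X.shape, y.shape)
```

Checking the names of the selected columns like this is a cheap way to confirm that the positional slice picked up the intended features.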

Data preprocessing: Applied Label Encoding to convert the Gender column to numerical format, then used ColumnTransformer to apply One-Hot Encoding to the multi-category Geography column while passing the remaining columns through unchanged. The data was then split 80/20 into training and test sets and standardized with StandardScaler.
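To make the difference between the two encodings concrete, here is a small sketch on made-up values. It is not the project's data, only an illustration of what LabelEncoder and OneHotEncoder produce.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Made-up sample values for illustration.
gender = np.array(["Female", "Male", "Female"])
geography = np.array([["France"], ["Spain"], ["Germany"]])

# Label encoding: one integer per category (classes are sorted alphabetically).
le = LabelEncoder()
gender_codes = le.fit_transform(gender)

# One-hot encoding: one 0/1 column per category, so no ordering is implied.
ohe = OneHotEncoder()
geography_onehot = ohe.fit_transform(geography).toarray()

print(le.classes_, gender_codes)
print(ohe.categories_[0])
print(geography_onehot)
```

A single integer column is enough for a two-valued feature such as Gender, while a three-valued feature such as Geography gets one column per category so the network does not read an artificial order into the codes.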

Model Architecture: Built a Sequential ANN with three layers: two hidden layers (6 units each, ReLU activation) and an output layer (1 unit, sigmoid activation). The model was compiled with the Adam optimizer and binary cross-entropy loss, then trained for 100 epochs with a batch size of 32.
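As a rough size check on this architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes 12 input features after encoding (3 one-hot Geography columns plus 9 others, matching the 12 values used in the single-observation prediction); dense_param_count is a hypothetical helper, not part of Keras.

```python
def dense_param_count(n_inputs, layer_units):
    """Return per-layer parameter counts for a stack of fully connected layers."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        # Each unit has one weight per input plus one bias.
        counts.append(fan_in * units + units)
        fan_in = units
    return counts

# Assumed input width of 12; layer sizes 6, 6, 1 as described above.
counts = dense_param_count(12, [6, 6, 1])
print(counts, sum(counts))
```

Under that assumption the per-layer counts are 78, 42, and 7, which should agree with what ann.summary() reports once the model has been built.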

Model Evaluation: Generated predictions on the test set with a 0.5 probability threshold, built a confusion matrix to analyze true/false positives and negatives, and calculated the accuracy score as an overall performance measure.
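The evaluation step can be sketched without a trained model. The probabilities below are made up; the point is only to show how thresholding at 0.5 leads to a confusion matrix and an accuracy figure.

```python
import numpy as np

# Made-up labels and predicted probabilities for six customers.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_prob = np.array([0.10, 0.70, 0.80, 0.30, 0.20, 0.90])

# Apply the 0.5 threshold to turn probabilities into class labels.
y_hat = (y_prob > 0.5).astype(int)

# Rows are actual classes, columns are predicted classes
# (the same layout sklearn's confusion_matrix uses).
cm = np.zeros((2, 2), dtype=int)
for actual, predicted in zip(y_true, y_hat):
    cm[actual, predicted] += 1

# Correct predictions sit on the diagonal.
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(f"accuracy = {accuracy:.4f}")
```

With these toy values the matrix is [[2, 1], [1, 2]]: two true negatives, one false positive, one false negative, and two true positives.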

📊 Code Snippets & Visualisations:

    # Importing the libraries
    import numpy as np
    import pandas as pd
    import tensorflow as tf
    from sklearn.preprocessing import LabelEncoder, StandardScaler, OneHotEncoder
    from sklearn.compose import ColumnTransformer
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix, accuracy_score, classification_report

    # Import dataset
    dataset = pd.read_csv('Churn_Modelling.csv')
    X = dataset.iloc[:, 3:-1].values
    y = dataset.iloc[:, -1].values
    print("Features (X):")  # Table 1
    print(X)
    print("\nTarget variable (y):")  # Table 2
    print(y)

    # Encoding categorical data (encoding gender column) (Table 3)
    le = LabelEncoder()
    X[:, 2] = le.fit_transform(X[:, 2])
    print("\nAfter Label Encoding Gender:")  # Table 3
    print(X)

    # Encoding categorical data (One Hot encoding geography column) (Table 4)
    ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
    X = np.array(ct.fit_transform(X))
    print("\nAfter One-Hot Encoding Geography:")  # Table 4
    print(X)

    # Splitting the dataset into Training and test set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Feature Scaling
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    # Building ANN
    ann = tf.keras.models.Sequential()
    ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
    ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
    ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

    # Compiling the ANN
    ann.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    # Training ANN (Table 5)
    history = ann.fit(X_train, y_train, batch_size=32, epochs=100)

    # Display training results
    print("\nTraining completed. Final accuracy:", history.history['accuracy'][-1])

    # Predicting results of a single observation
    # Note: The input should match the preprocessing (one-hot encoded geography + other features)
    sample_prediction = ann.predict(sc.transform([[1, 0, 0, 600, 1, 40, 3, 60000, 2, 1, 1, 50000]]))
    print("\nSingle prediction probability:", sample_prediction[0][0])
    print("Single prediction (>0.5):", sample_prediction > 0.5)

    # Predicting Test set results (Table 6)
    y_pred = ann.predict(X_test)
    y_pred_binary = (y_pred > 0.5)
    print("\nPredictions vs Actual (first 20 samples):")  # Table 6
    comparison = np.concatenate((y_pred_binary.reshape(len(y_pred_binary), 1), y_test.reshape(len(y_test), 1)), 1)
    print("Predicted | Actual")
    print(comparison[:20])

    # Making the Confusion Matrix (Table 7)
    cm = confusion_matrix(y_test, y_pred_binary)
    accuracy = accuracy_score(y_test, y_pred_binary)
    print("\nConfusion Matrix:")  # Table 7
    print(cm)
    print(f"\nAccuracy Score: {accuracy:.4f}")

    # Additional metrics for better evaluation
    print("\nClassification Report:")
    print(classification_report(y_test, y_pred_binary))

    # Model summary
    print("\nModel Architecture:")
    ann.summary()

Table 1: Variable X

Table 2: Variable y

Table 3: Encoding Gender

Table 4: Encoding Country

Table 5: Training the ANN

Table 6: Predicting Test Set Results

Table 7: Confusion Matrix

🌟 Key Insights:

Neural networks can capture non-linear relationships between customer demographics and churn behavior that a linear model may miss, which makes them a reasonable fit for this classification task

Proper encoding was essential for the model to use the categorical features, particularly the one-hot encoding of Geography, which lets the network learn a separate effect for each country instead of an artificial numeric ordering

Confusion matrix analysis revealed the trade-off between sensitivity (detecting customers who actually churn) and specificity (avoiding false alarms on customers who stay)
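The trade-off can be made visible by moving the decision threshold. The sketch below uses made-up probabilities and a hypothetical helper, sensitivity_specificity; it is an illustration, not output from the trained model.

```python
import numpy as np

def sensitivity_specificity(y_true, y_prob, threshold):
    """Return (sensitivity, specificity) for a given probability threshold."""
    y_hat = y_prob > threshold
    pos = y_true == 1
    tp = np.sum(y_hat & pos)    # churners correctly flagged
    fn = np.sum(~y_hat & pos)   # churners missed
    tn = np.sum(~y_hat & ~pos)  # stayers correctly left alone
    fp = np.sum(y_hat & ~pos)   # stayers flagged by mistake
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels (1 = churned) and predicted churn probabilities.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_prob = np.array([0.05, 0.20, 0.40, 0.60, 0.35, 0.55, 0.75, 0.90])

results = {t: sensitivity_specificity(y_true, y_prob, t) for t in (0.3, 0.5, 0.7)}
for t, (sens, spec) in results.items():
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

On this toy data, lowering the threshold to 0.3 catches every churner but flags half of the staying customers, while raising it to 0.7 removes the false alarms at the cost of missing half of the churners.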

Standardization was crucial for ANN convergence: financial features such as account balance and salary sit on much larger scales than the other inputs, and leaving them unscaled risks unstable training
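What the scaling step does can be written out by hand. This is a sketch on made-up numbers that mirrors the fit-on-train, transform-on-test pattern; the two columns stand in for a balance-like and an age-like feature.

```python
import numpy as np

# Made-up values: column 0 is balance-like, column 1 is age-like.
X_train = np.array([[0.0, 30.0], [50000.0, 40.0], [100000.0, 50.0]])
X_test = np.array([[75000.0, 35.0]])

# Statistics come from the training set only, so no test-set
# information leaks into preprocessing.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # population std (ddof=0), as StandardScaler uses

X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std  # reuse the training statistics

print(X_train_scaled)
print(X_test_scaled)
```

After scaling, both columns have mean 0 and unit variance on the training set, so neither dominates the weight updates merely because of its units.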

🧗🏾 Challenge Faced:

The main challenge was handling mixed categorical and numerical data types efficiently. Initially, I struggled with applying different encoding methods to different columns simultaneously. After experimenting with various approaches, I discovered ColumnTransformer, which allowed me to apply One-Hot Encoding to geography while preserving other numerical features, streamlining the preprocessing pipeline significantly.
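For illustration, here is one way the same idea can be expressed with named columns on a DataFrame. This is a variation, not the project's exact code: it swaps in OrdinalEncoder for the Gender column so that both encodings live inside a single ColumnTransformer, and the rows are made up.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Made-up rows with a subset of the features.
toy = pd.DataFrame({
    "CreditScore": [619, 608, 502],
    "Geography": ["France", "Spain", "Germany"],
    "Gender": ["Female", "Male", "Female"],
    "Age": [42, 41, 39],
})

ct = ColumnTransformer(
    transformers=[
        ("geo", OneHotEncoder(), ["Geography"]),    # three 0/1 columns
        ("gender", OrdinalEncoder(), ["Gender"]),   # one 0/1 column
    ],
    remainder="passthrough",  # keep CreditScore and Age unchanged
    sparse_threshold=0,       # always return a dense array
)
encoded = ct.fit_transform(toy)

print(encoded)
```

The output columns follow the transformer order: the three Geography indicators, then Gender, then the passthrough columns in their original order.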

View on GitHub
