Customer Churn Prediction with ANN (Classification)

This project builds a binary classification model that predicts bank customer churn using an Artificial Neural Network (ANN). I built a deep learning solution that uses customers' demographic and account information to identify those likely to leave the bank, enabling proactive retention strategies.

💻 Tech Stack:

🧪 Data Pipeline:

📊 Code Snippets & Visualisations:

# Importing the libraries
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder, StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score

# Import dataset
dataset = pd.read_csv('Churn_Modelling.csv')
# Features: every column from index 3 up to (not including) the last one
X = dataset.iloc[:, 3:-1].values
# Target: the last column
y = dataset.iloc[:, -1].values

print("Features (X):") # Table 1
print(X)
print("\nTarget variable (y):") # Table 2
print(y)

# Encoding categorical data (encoding gender column) (Table 3)
le = LabelEncoder()
X[:, 2] = le.fit_transform(X[:, 2])

print("\nAfter Label Encoding Gender:") # Table 3
print(X)

# Encoding categorical data (One Hot encoding geography column) (Table 4)
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [1])], remainder='passthrough')
X = np.array(ct.fit_transform(X))

print("\nAfter One-Hot Encoding Geography:") # Table 4
print(X)

# Splitting the dataset into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Feature Scaling
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

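# Building ANN
ann = tf.keras.models.Sequential()
# Two hidden layers with 6 units each and ReLU activation
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
# Output layer: a single sigmoid unit that outputs the churn probability
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))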

# Compiling the ANN
ann.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Training ANN (Table 5)
history = ann.fit(X_train, y_train, batch_size=32, epochs=100)

# Display training results (accuracy on the training set after the last epoch)
print("\nTraining completed. Final training accuracy:", history.history['accuracy'][-1])

# Predicting the result for a single observation
# Note: the input must follow the preprocessed column order: the three one-hot
# geography columns first, then the remaining features in their original order.
# sc.transform applies the same scaling that was fitted on the training set.
sample_prediction = ann.predict(sc.transform([[1, 0, 0, 600, 1, 40, 3, 60000, 2, 1, 1, 50000]]))
print("\nSingle prediction probability:", sample_prediction[0][0])
print("Single prediction (>0.5):", sample_prediction[0][0] > 0.5)

# Predicting Test set results (Table 6)
y_pred = ann.predict(X_test)
# Threshold the probabilities at 0.5 and cast to 0/1 so the labels match y_test
y_pred_binary = (y_pred > 0.5).astype(int)

print("\nPredictions vs Actual (first 20 samples):") # Table 6
# Place predicted and actual labels side by side, one row per customer
comparison = np.concatenate((y_pred_binary.reshape(-1, 1),
                             y_test.reshape(-1, 1)), axis=1)
print("Predicted | Actual")
print(comparison[:20])

# Making the Confusion Matrix (Table 7)
cm = confusion_matrix(y_test, y_pred_binary)
accuracy = accuracy_score(y_test, y_pred_binary)

print("\nConfusion Matrix:") # Table 7
print(cm)
print(f"\nAccuracy Score: {accuracy:.4f}")

# Additional metrics for better evaluation
from sklearn.metrics import classification_report
print("\nClassification Report:")
print(classification_report(y_test, y_pred_binary))

# Model summary
print("\nModel Architecture:")
ann.summary()
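
Accuracy alone can hide weak performance when one class is much rarer than the other, so it may help to see how the figures in the classification report relate to the confusion matrix. The sketch below derives them by hand, assuming scikit-learn's layout (rows are actual classes, columns are predicted classes). `binary_metrics` is a hypothetical helper, and the counts are made up for illustration; they are not results from this model.

```python
import numpy as np


def binary_metrics(cm):
    """Derive accuracy, precision, recall and F1 from a 2x2 confusion matrix.

    Assumes scikit-learn's layout: rows are actual classes and columns are
    predicted classes, so flattening yields (tn, fp, fn, tp).
    """
    tn, fp, fn, tp = (int(v) for v in np.asarray(cm).ravel())
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. no positive predictions at all)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only, not output from the ANN above
example_cm = np.array([[1500, 100],
                       [200, 200]])
metrics = binary_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy looks high while recall for the positive class is only one half, which is the kind of gap the classification report is meant to expose.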

🌟 Key Insights:

🧗🏾 Challenge Faced:

The main challenge was handling mixed categorical and numerical columns in a single preprocessing step. Initially, I struggled to apply different encoding methods to different columns at the same time. After experimenting with several approaches, I discovered ColumnTransformer. It let me apply One-Hot Encoding to the geography column while passing the remaining features through unchanged (remainder='passthrough'), streamlining the preprocessing pipeline significantly.
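
The idea of routing different columns to different transformers can be sketched on a toy table. This is a minimal illustration, not the project's actual pipeline. The rows are hypothetical values, the column names are assumed to mirror the churn dataset, and the sketch swaps LabelEncoder for OrdinalEncoder (scikit-learn's encoder intended for feature columns) and moves scaling inside the ColumnTransformer. Both of those choices are assumptions rather than what the project code does.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler

# Toy rows with hypothetical values; column names are assumed for illustration
toy = pd.DataFrame({
    "CreditScore": [600, 720, 550, 680],
    "Geography": ["France", "Spain", "Germany", "France"],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [40, 35, 52, 29],
})

# Each tuple is (name, transformer, columns); outputs are concatenated in this order
preprocess = ColumnTransformer(
    transformers=[
        ("geo", OneHotEncoder(handle_unknown="ignore"), ["Geography"]),
        ("gender", OrdinalEncoder(), ["Gender"]),
        ("num", StandardScaler(), ["CreditScore", "Age"]),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

# Fit on the toy table: 3 one-hot columns + 1 gender column + 2 scaled columns
X_toy = preprocess.fit_transform(toy)
print(X_toy.shape)

# A new customer goes through the same fitted transformers, so the one-hot
# vector does not have to be written out by hand
new_customer = pd.DataFrame([{
    "CreditScore": 600, "Geography": "France", "Gender": "Male", "Age": 40,
}])
X_new = preprocess.transform(new_customer)
print(X_new)
```

Keeping every step inside one fitted object means a single `transform` call reproduces the training-time preprocessing for new rows, which reduces the risk of a mismatched column order.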
