This project segments mall customers by Annual Income and Spending Score using the K-Means clustering algorithm.
The goal is to perform unsupervised customer segmentation so that businesses can understand their different customer types and run targeted marketing strategies.
- Source: Mall_Customers.csv
- Features:
  - CustomerID
  - Gender
  - Age
  - Annual Income (k$)
  - Spending Score (1-100)
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn (KMeans Clustering)
- Data Loading and Cleaning
  - Loaded the dataset using `pandas`
  - Checked for missing values
- Feature Selection
  - Used only `Annual Income` and `Spending Score` for clustering
- Elbow Method for Optimal Clusters
  - Calculated WCSS for K=1 to 10
  - Plotted the Elbow graph to find the best K (optimal at K=5)
- K-Means Clustering
  - Applied `KMeans(n_clusters=5)`
  - Clustered customers into 5 segments
- Visualization
  - Plotted each cluster with a distinct color
  - Marked the centroids of the clusters
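
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names (`load_features`, `elbow_wcss`, `fit_segments`) and the settings `n_init=10` and `random_state=42` are assumptions. The demo at the bottom uses synthetic points so the block runs without Mall_Customers.csv.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Column names as listed in the dataset description.
FEATURES = ["Annual Income (k$)", "Spending Score (1-100)"]


def load_features(csv_path):
    """Load the CSV, print missing-value counts, return the two clustering features."""
    df = pd.read_csv(csv_path)
    print(df.isnull().sum())  # number of missing values per column
    return df[FEATURES].to_numpy()


def elbow_wcss(X, k_values=range(1, 11), seed=42):
    """Return WCSS (KMeans.inertia_) for each K, the input to an Elbow plot."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).inertia_
        for k in k_values
    ]


def fit_segments(X, n_clusters=5, seed=42):
    """Fit K-Means and return (cluster label per row, centroid coordinates)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(X)
    return labels, model.cluster_centers_


# Synthetic stand-in data (NOT the real dataset): five blobs in income/score space.
rng = np.random.default_rng(0)
blob_centers = np.array([[25, 20], [25, 80], [55, 50], [90, 15], [90, 85]])
X = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in blob_centers])

wcss = elbow_wcss(X)                 # one WCSS value for each K in 1..10
labels, centroids = fit_segments(X)  # 5 segments, as chosen from the Elbow graph
```

With the real file, `X = load_features("Mall_Customers.csv")` would take the place of the synthetic array.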
Customer Clusters | Elbow Method |
---|---|
![]() | ![]() |
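
Plots like the two above could be produced along these lines. This is a sketch under assumptions: the function names `plot_elbow` and `plot_clusters`, the marker sizes, and the colors are illustrative choices, and the data is a synthetic placeholder rather than the real customer file.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans


def plot_elbow(wcss):
    """Line plot of WCSS against K; the bend ('elbow') suggests a reasonable K."""
    fig, ax = plt.subplots()
    ax.plot(range(1, len(wcss) + 1), wcss, marker="o")
    ax.set_xlabel("Number of clusters (K)")
    ax.set_ylabel("WCSS")
    ax.set_title("Elbow Method")
    return fig


def plot_clusters(X, labels, centroids):
    """Scatter plot with one color per cluster and the centroids marked."""
    fig, ax = plt.subplots()
    for cluster_id in np.unique(labels):
        points = X[labels == cluster_id]
        ax.scatter(points[:, 0], points[:, 1], s=30, label=f"Cluster {cluster_id + 1}")
    ax.scatter(centroids[:, 0], centroids[:, 1], s=200, c="black", marker="X",
               label="Centroids")
    ax.set_xlabel("Annual Income (k$)")
    ax.set_ylabel("Spending Score (1-100)")
    ax.set_title("Customer Clusters")
    ax.legend()
    return fig


# Synthetic placeholder points (NOT the real dataset).
rng = np.random.default_rng(0)
X = rng.uniform(low=[15, 1], high=[140, 100], size=(200, 2))

wcss = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
        for k in range(1, 11)]
model = KMeans(n_clusters=5, n_init=10, random_state=42).fit(X)

elbow_fig = plot_elbow(wcss)
cluster_fig = plot_clusters(X, model.labels_, model.cluster_centers_)
```

In a Jupyter notebook the figures render inline; in a plain script, call `plt.show()` afterwards.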
- Cluster 1: High income, low spenders
- Cluster 2: Low income, high spenders
- Cluster 3: Average income, average spenders
- ...
- Businesses can use these segments for targeted advertising, loyalty programs, and personalized offers
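
One way to arrive at descriptions like "high income, low spenders" is to average the two features per cluster. The sketch below assumes the cluster labels have been stored in a column named `Cluster` (a hypothetical name), and the rows in `demo` are made-up values for illustration only, not results from the dataset.

```python
import pandas as pd

FEATURES = ["Annual Income (k$)", "Spending Score (1-100)"]


def summarize_clusters(df, label_col="Cluster"):
    """Mean income and spending score per cluster, rounded to one decimal."""
    return df.groupby(label_col)[FEATURES].mean().round(1)


# Made-up rows for illustration (NOT real results from Mall_Customers.csv).
demo = pd.DataFrame({
    "Annual Income (k$)": [85, 90, 20, 25, 55, 60],
    "Spending Score (1-100)": [15, 20, 80, 85, 50, 45],
    "Cluster": [1, 1, 2, 2, 3, 3],
})

summary = summarize_clusters(demo)
print(summary)
```

Reading each row of `summary` against the overall averages gives a plain-language profile for that segment.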
```bash
git clone https://github.com/yourusername/customer-segmentation-kmeans.git
cd customer-segmentation-kmeans
pip install -r requirements.txt
jupyter notebook
```