A machine learning project to replicate a market segmentation case study on McDonald's customers using clustering techniques, powered by Python and scikit-learn.
- 💬 Customer segmentation using unsupervised learning (KMeans)
- 📊 Feature preprocessing (scaling, encoding)
- 🔍 Elbow method to determine optimal number of clusters
- 🧬 PCA-based cluster visualization
- 📄 Modular and scalable codebase structured like an ML pipeline
- 🚀 Ready for deployment and GitHub publication
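
The elbow method listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic (`make_blobs` stands in for the survey features), and the helper name `elbow_inertias` is hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the survey features (assumption: not the real dataset).
X, _ = make_blobs(n_samples=300, centers=3, n_features=5, random_state=42)

# Scale so that no single feature dominates the Euclidean distances KMeans uses.
X_scaled = StandardScaler().fit_transform(X)


def elbow_inertias(data, k_values):
    """Fit KMeans for each k and return the within-cluster sum of squares."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=42)
        model.fit(data)
        inertias.append(model.inertia_)
    return inertias


k_values = list(range(1, 9))
inertias = elbow_inertias(X_scaled, k_values)
for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting `inertias` against `k_values` gives the elbow chart; the "elbow" is the value of k after which the curve flattens, which is a visual judgment rather than a hard rule.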
| Technology | Purpose |
|---|---|
| Python | Core programming language |
| pandas | Data manipulation and I/O |
| scikit-learn | Clustering and preprocessing |
| matplotlib | Visualizations |
| Flask (optional) | Web deployment interface |
| Docker (optional) | Containerization |
| GitHub | Version control and collaboration |
┌────────────────────┐        ┌──────────────────────┐
│ Dataset (CSV File) │        │       main.py        │
└─────────┬──────────┘        └──────────┬───────────┘
          │                              │
          ▼                              ▼
┌────────────────────┐        ┌──────────────────────┐
│ Load & Preprocess  │◀──────▶│     src/ modules     │
└─────────┬──────────┘        └──────────┬───────────┘
          ▼                              ▼
┌────────────────────┐        ┌──────────────────────┐
│  Clustering Logic  │        │    Visualization     │
└─────────┬──────────┘        └──────────┬───────────┘
          ▼                              ▼
   Outputs: Cluster labels, PCA plots, Elbow chart
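
One way the stages in this diagram might be wired together is sketched below. The function names (`load_data`, `preprocess`, `cluster`, `run_pipeline`) are hypothetical and do not necessarily match the modules under `src/`, and the loader generates synthetic rows instead of reading the CSV.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def load_data(n_rows=200, seed=0):
    """Hypothetical loader: returns synthetic rows instead of reading the CSV."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "Age": rng.integers(18, 71, size=n_rows),
        "Like": rng.integers(-3, 4, size=n_rows),
    })


def preprocess(df):
    """Standardize numeric columns to zero mean and unit variance."""
    return StandardScaler().fit_transform(df)


def cluster(features, n_clusters=3):
    """Fit KMeans and return one cluster label per row."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return model.fit_predict(features)


def run_pipeline():
    """Run load -> preprocess -> cluster and attach the labels to the frame."""
    df = load_data()
    features = preprocess(df)
    df["cluster"] = cluster(features)
    return df


result = run_pipeline()
print(result["cluster"].value_counts().sort_index())
```

Keeping each stage as a small function with plain inputs and outputs makes it straightforward to swap in a real CSV loader or a different clustering step later.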
fastfood_segmentation/
├── data/                  # Dataset CSV
├── models/                # (Optional) Saved models
├── outputs/               # Plots and analysis results
├── src/
│   ├── data/              # Data loading
│   ├── preprocessing/     # Data cleaning, encoding, scaling
│   ├── model/             # Clustering (KMeans)
│   └── visualization/     # Elbow & cluster plots
├── main.py                # Pipeline entry point
├── requirements.txt       # Python dependencies
└── README.md              # Project overview
- Python 3.10+
git clone https://github.com/your-username/mcdonalds-segmentation.git
cd mcdonalds-segmentation
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
Run the full pipeline:
python main.py
Expected outputs:
- outputs/elbow_plot.png
- outputs/cluster_plot.png
| Column | Type | Description |
|---|---|---|
| yummy, cheap... | Categorical | Yes/No features about food perception |
| Like | Integer | Rating from -3 to +3 |
| Age | Integer | Customer age |
| VisitFrequency | Categorical | Visit frequency (encoded) |
| Gender | Categorical | Gender (encoded) |
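
The encoding and scaling implied by this table might look like the sketch below. The column names come from the table, but the sample rows, the `VisitFrequency` levels and their ordering, and the `Gender` labels are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny made-up sample; column names follow the table, values are illustrative.
df = pd.DataFrame({
    "yummy": ["Yes", "No", "Yes", "No"],
    "cheap": ["Yes", "Yes", "No", "No"],
    "Like": [3, -1, 0, -3],
    "Age": [25, 47, 33, 61],
    "VisitFrequency": ["Once a week", "Once a month", "Once a year", "Once a month"],
    "Gender": ["Female", "Male", "Female", "Male"],
})

# Yes/No perception features become 1/0.
binary_cols = ["yummy", "cheap"]
df[binary_cols] = (df[binary_cols] == "Yes").astype(int)

# Ordinal encoding for visit frequency (assumed levels, rarest to most frequent).
frequency_order = {"Once a year": 0, "Once a month": 1, "Once a week": 2}
df["VisitFrequency"] = df["VisitFrequency"].map(frequency_order)

# Gender as a single 0/1 indicator column (assumed labels).
df["Gender"] = (df["Gender"] == "Female").astype(int)

# Scale the wider-range columns so they are comparable to the 0/1 features.
numeric_cols = ["Like", "Age", "VisitFrequency"]
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])

print(df.round(2))
```

Scaling matters here because KMeans relies on Euclidean distance: left unscaled, `Age` would outweigh every 0/1 perception feature.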
Below are examples of the plots generated by the analysis:
Elbow Plot for Optimal Clusters:

PCA Cluster Plot for Customer Visualization (Example with K=3):
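
A plot of this kind could be produced along the following lines. This is a sketch on synthetic data, not the project's plotting code; the output filename `cluster_plot_example.png` is hypothetical, and K=3 simply mirrors the example above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed survey features (assumption).
X, _ = make_blobs(n_samples=300, centers=3, n_features=6, random_state=7)
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full feature space, then project to 2D only for plotting.
labels = KMeans(n_clusters=3, n_init=10, random_state=7).fit_predict(X_scaled)

pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="viridis", s=15)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_title("KMeans clusters in PCA space (K=3)")
fig.savefig("cluster_plot_example.png", dpi=120)  # hypothetical output path

print("Explained variance ratio:", pca.explained_variance_ratio_.round(3))
```

The explained variance ratio indicates how much of the original spread the two plotted components retain; when it is low, clusters that look overlapping in 2D may still be well separated in the full feature space.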

If tests are added:
python -m pytest tests/
MIT License
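- Book: Market Segmentation Analysis: Understanding It, Doing It, and Making It Useful, by Sara Dolnicar, Bettina Grün, and Friedrich Leisch (Springer, 2018)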
- scikit-learn open source community
- Inspiration from McDonald's case study
For questions or suggestions, open an issue or contact: Vishal Gorule – [gorulevishal984@gmail.com] – Vision Expo