🎬 TMDB Multi-Label Movie Genre Classifier
Serverless Machine Learning Pipeline — TF-IDF + Linear SVC — Fully Automated & Deployed
Summary
This project demonstrates the ability to design, automate, and deploy a real-world Machine Learning system without relying on paid cloud services.
It showcases strong understanding and application of:
- MLOps & CI/CD
- Automated retraining & scheduled jobs
- Model deployment & UI interface
- Testing, documentation, reproducibility
The model predicts multiple genres for a movie based on its description — similar to how streaming platforms tag content for recommendations.
➡ Live Demo: https://huggingface.co/spaces/arjun-varma/tmdb-genre-classifier
➡ Model Hub: https://huggingface.co/arjun-varma/tmdb-genre-classifier
🧠 Problem — Why Multi‑Label Classification?
Movie genres are not mutually exclusive:
| Plot Summary | Correct Genres |
|---|---|
| Soldier returns from war, struggling with trauma | Drama, War |
| AI becomes sentient and turns against creators | Sci‑Fi, Thriller |
| A musician finds love on tour | Music, Romance |
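A single‑label classifier can assign only one genre per movie, so it cannot represent any of the examples above.
Multi‑label learning instead predicts every genre that applies to a movie at the same time.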
This creates challenges:
- Soft labels
- Ambiguity
- Genre co‑occurrence patterns
- Long‑tail imbalance (Documentary vs Thriller vs Music)
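To make the multi-label setup concrete, here is a minimal sketch of how genre lists can be turned into multi-hot target vectors. It uses only the three example rows from the table above; the helper name `to_multi_hot` is illustrative and not taken from the project code.

```python
from collections import Counter

# Toy rows copied from the table above; real training data would come from TMDB.
movies = [
    ("Soldier returns from war, struggling with trauma", ["Drama", "War"]),
    ("AI becomes sentient and turns against creators", ["Sci-Fi", "Thriller"]),
    ("A musician finds love on tour", ["Music", "Romance"]),
]

# A fixed, sorted genre vocabulary keeps the column order reproducible.
genres = sorted({g for _, labels in movies for g in labels})


def to_multi_hot(labels, vocabulary):
    """Return a 0/1 vector with one slot per genre in `vocabulary`."""
    present = set(labels)
    return [1 if g in present else 0 for g in vocabulary]


# One row per movie, one column per genre; several columns can be 1 at once.
Y = [to_multi_hot(labels, genres) for _, labels in movies]

# Per-genre counts are a quick way to spot long-tail imbalance in a real dataset.
genre_counts = Counter(g for _, labels in movies for g in labels)
```

Each row of `Y` can contain several ones, which is exactly what a single-label target cannot express.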
🧱 Architecture — Serverless ML Pipeline
No AWS SageMaker, no GCP Vertex AI.
Infrastructure cost = $0
⚙️ Model — Why TF‑IDF + Linear SVC vs Transformers?
| Choice | Reason |
|---|---|
| Transformers | Expensive & slow for nightly retraining |
| Neural Networks | Need GPUs / infra |
| Logistic Regression | High precision, low recall |
| Linear SVC + TF‑IDF | Fast, scalable, interpretable 👈 Best for pipeline |
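The biggest improvement came from switching classifiers:
- Logistic Regression predicted almost no genres → it stayed “safe”, giving high precision but near‑zero recall
- Linear SVC learned margin‑based decision boundaries → better multi‑genre recall
- Applying a sigmoid to the SVC scores and thresholding the result → configurable precision/recall trade‑off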
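The approach described above could be sketched with scikit-learn roughly as follows. This is an assumption about how the pieces fit together, not the project's actual training code: the one-vs-rest wrapper, the default hyperparameters, the toy data, and the function name `predict_genres` are all illustrative. The sigmoid only squashes SVC margins into the range (0, 1); the resulting values are scores, not calibrated probabilities.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy training data (assumption); the real pipeline trains on TMDB descriptions.
texts = [
    "Soldier returns from war, struggling with trauma",
    "AI becomes sentient and turns against creators",
    "A musician finds love on tour",
]
labels = [["Drama", "War"], ["Sci-Fi", "Thriller"], ["Music", "Romance"]]

# Encode genre lists as a multi-hot matrix, one column per genre.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One binary Linear SVC per genre on top of TF-IDF features.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LinearSVC()))
model.fit(texts, Y)


def predict_genres(text, threshold=0.25):
    """Return (genre, score) pairs whose sigmoid-squashed margin meets the threshold."""
    margins = model.decision_function([text])[0]
    scores = 1.0 / (1.0 + np.exp(-margins))  # uncalibrated scores in (0, 1)
    return [(g, float(s)) for g, s in zip(mlb.classes_, scores) if s >= threshold]
```

Lowering `threshold` returns more genres per movie and raising it returns fewer, which is the configurable trade-off mentioned above.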
📊 Performance Metrics
| Model | Precision_micro | Recall_micro | F1_macro | Result |
|---|---|---|---|---|
| Logistic Regression | 0.83 | 0.006 | ~0.03 | Almost no predictions |
| Linear SVC + threshold 0.25 | 0.16 | 0.99 | 0.27 | Usable predictions |
Interpretation:
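- Recall of 0.99 with precision of 0.16 means that, at threshold 0.25, the Linear SVC finds almost every true genre but also predicts many genres that do not apply
- The threshold lets each application choose its own precision/recall balance; raising it generally trades recall for precision
If this were powering recommendations, the choice of threshold would matter.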
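For readers unfamiliar with the column names in the table, the following is a small, dependency-free sketch of how micro-averaged precision and recall and macro-averaged F1 are computed for multi-hot predictions. The toy matrices are made up for illustration and are unrelated to the reported numbers; returning 0.0 when a denominator is zero is a convention chosen here.

```python
def label_counts(y_true, y_pred, j):
    """Return (tp, fp, fn) for label column j."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        if p_row[j] and t_row[j]:
            tp += 1
        elif p_row[j]:
            fp += 1
        elif t_row[j]:
            fn += 1
    return tp, fp, fn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero (a chosen convention)."""
    return num / den if den else 0.0


def micro_precision_recall(y_true, y_pred):
    """Pool counts over all labels first, then compute precision and recall."""
    counts = [label_counts(y_true, y_pred, j) for j in range(len(y_true[0]))]
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return safe_div(tp, tp + fp), safe_div(tp, tp + fn)


def macro_f1(y_true, y_pred):
    """Compute F1 per label, then take the unweighted mean over labels."""
    f1_scores = []
    for j in range(len(y_true[0])):
        tp, fp, fn = label_counts(y_true, y_pred, j)
        f1_scores.append(safe_div(2 * tp, 2 * tp + fp + fn))
    return sum(f1_scores) / len(f1_scores)


# Toy data (illustrative only): two movies, three genres.
y_true = [[1, 0, 1], [0, 1, 0]]
predict_all = [[1, 1, 1], [1, 1, 1]]    # every genre for every movie
predict_none = [[0, 0, 0], [0, 0, 0]]   # no genre for any movie

print(micro_precision_recall(y_true, predict_all))   # (0.5, 1.0)
print(micro_precision_recall(y_true, predict_none))  # (0.0, 0.0)
```

Predicting every genre reaches perfect recall at the cost of precision, which is why recall should always be read together with precision.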
🧪 Testing
This project includes:
- Unit tests for vectorization & data transformation
- Mocked API tests for dataset ingestion
- End‑to‑end pipeline test verifying artifacts & metrics
Tools used:
- `pytest`
- `monkeypatch`
- `tmp_path`
- GitHub CI
This demonstrates reliability in automation-focused ML environments.
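A mocked ingestion test of the kind listed above might look like the sketch below. Everything specific here is hypothetical: the function `fetch_movies`, the URL, and the JSON payload shape are placeholders rather than the project's real ingestion code or the actual TMDB response format. The test relies on pytest's built-in `monkeypatch` and `tmp_path` fixtures.

```python
import io
import json
import urllib.request


def fetch_movies(url, out_path):
    """Hypothetical ingestion step: download JSON and save the rows to disk."""
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    rows = [
        {"overview": m["overview"], "genres": m["genres"]}
        for m in payload["results"]
    ]
    out_path.write_text(json.dumps(rows))
    return len(rows)


def test_fetch_movies_writes_rows(monkeypatch, tmp_path):
    # Made-up payload; the shape is an assumption for this sketch.
    fake_payload = {
        "results": [
            {"overview": "A musician finds love on tour", "genres": ["Music", "Romance"]}
        ]
    }

    def fake_urlopen(url):
        # BytesIO supports read() and the context-manager protocol.
        return io.BytesIO(json.dumps(fake_payload).encode("utf-8"))

    # Replace the network call so the test never leaves the machine.
    monkeypatch.setattr(urllib.request, "urlopen", fake_urlopen)

    out_file = tmp_path / "movies.json"
    assert fetch_movies("https://example.invalid/movies", out_file) == 1
    assert json.loads(out_file.read_text())[0]["genres"] == ["Music", "Romance"]
```

Because the network call is patched and output goes to a temporary directory, a test like this can run in CI without API keys or leftover files.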
🖥 Demo & Integration
The model provides:
- ⭐ Ranked genre probabilities
- ⭐ Adjustable confidence threshold
- ⭐ Real‑time inference
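The ranked output with an adjustable threshold could be produced by a helper along these lines. The function name `rank_genres` and the example scores are hypothetical and do not come from the demo's source.

```python
def rank_genres(scores, threshold=0.25):
    """Sort genres by score (highest first) and flag those at or above the threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(genre, score, score >= threshold) for genre, score in ranked]


# Made-up scores for illustration only.
example_scores = {"Music": 0.1, "Drama": 0.8, "War": 0.6}

for genre, score, selected in rank_genres(example_scores, threshold=0.5):
    marker = "selected" if selected else "below threshold"
    print(f"{genre}: {score:.2f} ({marker})")
```

Keeping the full ranking and only flagging the selected genres lets a UI redraw the selection when the threshold slider moves, without rerunning the model.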
🚀 Future Enhancements
| Idea | Value |
|---|---|
| Compare vs MiniLM Transformer | Benchmark credibility |
| Add FastAPI inference service | Deployable microservice |
| Visualize confidence & confusion | Explainable AI |
✍ Author
Arjun Varma
Machine Learning Engineer & Systems Developer
Designed for real-world ML infrastructure readiness.
