Guide to Serving Machine Learning Models in Production

Here’s a simple checklist for people who deploy machine learning models to production, based on my personal experience.

Identify bias: Will the bias affect users? Discuss it with the product team.
Performance profile: List your hardware requirements (CPU, GPU, RAM, disk). Is the usage sustained or bursty?
Model file size: Large model files can congest the network. Decide on push-based or pull-based deployment.
Continuous testing: Test the model on real-world data. Check for over-fitting.
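
For the continuous testing item, here is a minimal sketch of one way to check for over-fitting: compare the accuracy measured offline with the accuracy on labeled real-world samples and flag a large gap. The function names and the 0.05 tolerance are my own assumptions for illustration, not a standard; pick the metric and threshold that fit your product.

```python
def accuracy(labels, predictions):
    """Return the fraction of predictions that match the labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def overfitting_gap(offline_acc, live_acc, max_gap=0.05):
    """Return (gap, flagged) for offline vs. real-world accuracy.

    max_gap is a hypothetical tolerance; tune it for your own use case.
    """
    gap = offline_acc - live_acc
    return gap, gap > max_gap


# Example: the model looks perfect offline but misses 2 of 5 live samples.
offline = accuracy([1, 0, 1, 1], [1, 0, 1, 1])
live = accuracy([1, 0, 1, 1, 0], [1, 1, 0, 1, 0])
gap, flagged = overfitting_gap(offline, live)
print(f"gap={gap:.2f} flagged={flagged}")
```

Running a check like this on a schedule, on freshly labeled data, is what makes the testing continuous rather than a one-off.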

Log collection: Collect logs for debugging, remove personal information, and sample if needed.
Error handling: No model is perfect. Respond to user feedback, fast.
Emergency response: Be prepared to patch the application with heuristic rules.
Observability: Detect concept drift in the real world as soon as possible.
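
For the observability item, one simple signal is the Population Stability Index (PSI), which compares the distribution of a numeric feature (or of the model's scores) at training time with what production sees now. Strictly speaking this measures a shift in the data, which is a proxy for concept drift rather than a direct test of it. The sketch below uses only the standard library; the bin count, the `eps` floor, and any alert threshold you pair with it are assumptions to tune.

```python
import math


def psi(reference, live, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    if not reference or not live:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if all values are equal.
    width = (hi - lo) / bins or 1.0

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp live values outside the reference range
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bucket_shares(reference)
    live_shares = bucket_shares(live)
    return sum((l - r) * math.log(l / r) for r, l in zip(ref_shares, live_shares))


# Example: the live feature is shifted upward relative to the training data.
training_values = list(range(100))
production_values = [v + 50 for v in training_values]
print(f"PSI = {psi(training_values, production_values):.2f}")
```

Identical distributions give a PSI of 0, and the value grows as the two samples diverge, so it works as a metric to chart and alert on.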