MLAAS: Machine Learning API Backend
REST API backend for serving machine learning models as a service — wrapping scikit-learn and PyTorch models behind versioned HTTP endpoints.
MLAAS (Machine Learning as a Service) is the API backend for serving trained ML models over HTTP. Where the Streamlit deployment handles the interactive browser interface, this repo provides the underlying REST layer — model loading, versioning, and inference endpoints that other clients can call programmatically.
What It Does
The server wraps scikit-learn and PyTorch model artifacts behind versioned endpoints:
POST /predict: run inference on a submitted feature payload
GET /model/info: return model metadata (version, trained-on, input schema)
GET /health: used by the load balancer for readiness probes
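As a rough illustration of how these three routes could be laid out in Flask, here is a minimal sketch. The request shape (a JSON body with a "features" object), the MODEL_INFO values, and the predict_one placeholder are all assumptions made for the example, not the repo's actual schema or model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical metadata; the real values would come from the loaded artifact.
MODEL_INFO = {
    "version": "v1",
    "trained_on": "placeholder",
    "input_schema": ["x1", "x2"],
}


def predict_one(features):
    """Placeholder inference: sums the features in schema order."""
    return sum(float(features[name]) for name in MODEL_INFO["input_schema"])


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict) or not isinstance(payload.get("features"), dict):
        return jsonify(error="expected a JSON body with a 'features' object"), 400
    features = payload["features"]
    missing = [name for name in MODEL_INFO["input_schema"] if name not in features]
    if missing:
        return jsonify(error="missing features", missing=missing), 400
    return jsonify(
        prediction=predict_one(features),
        model_version=MODEL_INFO["version"],
    )


@app.route("/model/info", methods=["GET"])
def model_info():
    return jsonify(MODEL_INFO)


@app.route("/health", methods=["GET"])
def health():
    # Kept trivial so a readiness probe gets a fast answer.
    return jsonify(status="ok")
```

Returning the model version alongside each prediction is one way to let callers record which artifact produced a result.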
Models are loaded at startup from a model registry directory and cached in memory. Swapping a model version requires updating the registry path and restarting — no hot-reload, which is intentional for reproducibility in regulated contexts.
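The load-once-and-cache behaviour can be sketched as follows. This covers only pickle artifacts; the flat directory layout (one .pkl file per model), the MODEL_REGISTRY_DIR environment variable, and the function names are assumptions for illustration, not the repo's actual layout.

```python
import os
import pickle
from pathlib import Path

# In-memory cache, filled once at startup and never refreshed while running.
_MODELS = {}


def load_registry(registry_dir):
    """Load every .pkl artifact directly under registry_dir."""
    loaded = {}
    for path in sorted(Path(registry_dir).glob("*.pkl")):
        with path.open("rb") as fh:
            # pickle.load can execute arbitrary code: only load trusted artifacts.
            loaded[path.stem] = pickle.load(fh)
    return loaded


def init_models(registry_dir=None):
    """Populate the cache; intended to be called once before serving requests."""
    # MODEL_REGISTRY_DIR is a hypothetical setting name.
    registry_dir = registry_dir or os.environ.get("MODEL_REGISTRY_DIR", "models")
    _MODELS.clear()
    _MODELS.update(load_registry(registry_dir))
    return sorted(_MODELS)


def get_model(name):
    """Return a cached model; raises KeyError if it was not loaded at startup."""
    return _MODELS[name]
```

Because nothing re-reads the directory after init_models runs, a running process keeps serving exactly the artifacts it started with, which matches the restart-to-swap behaviour described above.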
Why a Separate API Layer
Streamlit handles the UX but is not suitable for production-grade programmatic access. The API layer decouples model serving from presentation: data pipelines, batch jobs, and external systems can call the inference endpoint without going through a browser. This separation also made it easier to deploy the model server on a different instance from the Streamlit front end.
Stack
Python and Flask. Model artifacts are serialized as pickle files (scikit-learn) or .pt files (PyTorch). Deployment is Docker-based: a single docker-compose.yml wires together the API server and a Redis cache that memoizes results for repeated identical inputs.
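One way to memoize results for identical inputs is to key the cache on a hash of the canonicalized payload plus the model version. The sketch below assumes that key scheme and a one-hour TTL, neither of which is confirmed by the repo. DictCache is an in-process stand-in so the example runs without a Redis server; a redis-py client could be passed instead, since it also exposes get(key) and set(key, value, ex=seconds).

```python
import hashlib
import json


def cache_key(model_version, features):
    """Build a deterministic key: same features in any key order give the same hash."""
    canonical = json.dumps(features, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"predict:{model_version}:{digest}"


class DictCache:
    """Hypothetical in-memory stand-in for the get/set subset used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None):
        # ex (TTL in seconds) is accepted for interface parity but ignored here.
        self._data[key] = value


def cached_predict(cache, model_version, features, predict_fn, ttl_seconds=3600):
    """Return a cached result if present, else compute, store, and return it."""
    key = cache_key(model_version, features)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)  # json.loads accepts both str and bytes
    result = predict_fn(features)
    cache.set(key, json.dumps(result), ex=ttl_seconds)
    return result
```

Including the model version in the key means a new version never serves results computed by an old one. Note that 1 and 1.0 serialize differently, so they are treated as distinct inputs under this scheme.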