MLAAS: Machine Learning API Backend

Tags: mlops, api, python, scikit-learn, deployment, rest-api

REST API backend for serving machine learning models as a service, wrapping scikit-learn and PyTorch models behind versioned HTTP endpoints.

MLAAS (Machine Learning as a Service) is the API backend for serving trained ML models over HTTP. Where the Streamlit deployment handles the interactive browser interface, this repo provides the underlying REST layer: model loading, versioning, and inference endpoints that other clients can call programmatically.

What It Does

The server wraps scikit-learn and PyTorch model artifacts behind versioned endpoints:

POST /predict - run inference on a submitted feature payload
GET /model/info - return model metadata (version, trained-on, input schema)
GET /health - used by the load balancer for readiness probes
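As a rough illustration of how these three routes might look in Flask, here is a minimal sketch. It is not the repo's actual code: the MeanModel stand-in, the "features" payload key, the response field names, and the placeholder metadata values are all assumptions made for the example.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


class MeanModel:
    """Hypothetical stand-in for a real scikit-learn or PyTorch model."""

    def predict(self, rows):
        # Return the mean of each feature row.
        return [sum(row) / len(row) for row in rows]


MODEL = MeanModel()

# Placeholder metadata; the real values would come from the model artifact.
MODEL_INFO = {
    "version": "0.0.0-example",
    "trained_on": "example-dataset",
    "input_schema": {"features": "non-empty list of numbers"},
}


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True)
    features = payload.get("features") if isinstance(payload, dict) else None
    valid = (
        isinstance(features, list)
        and len(features) > 0
        and all(isinstance(v, (int, float)) for v in features)
    )
    if not valid:
        return jsonify(error="'features' must be a non-empty list of numbers"), 400
    prediction = MODEL.predict([features])[0]
    return jsonify(prediction=prediction, model_version=MODEL_INFO["version"])


@app.route("/model/info", methods=["GET"])
def model_info():
    return jsonify(MODEL_INFO)


@app.route("/health", methods=["GET"])
def health():
    # Cheap readiness check: no model call, just confirms the process is up.
    return jsonify(status="ok")
```

Returning the model version alongside each prediction is one way to let callers record which model produced a given result.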

Models are loaded once at startup from a model registry directory and cached in memory. Swapping a model version requires updating the registry path and restarting the server. There is no hot-reload, and that is intentional: in regulated contexts, a running server should map to exactly one known model version so that results stay reproducible.
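The load-once-at-startup pattern can be sketched as follows. This is an assumption-heavy example, not the repo's loader: the load_registry name and the one-pickle-file-per-version layout are hypothetical, and PyTorch .pt loading is left out to keep the sketch dependency-free.

```python
import pickle
from pathlib import Path


def load_registry(registry_dir):
    """Load every *.pkl artifact in registry_dir, keyed by file stem.

    Assumed (hypothetical) layout: one pickle file per model version,
    e.g. <registry_dir>/v1.pkl. Only unpickle files from a trusted source,
    since unpickling can execute arbitrary code.
    """
    models = {}
    for path in sorted(Path(registry_dir).glob("*.pkl")):
        with path.open("rb") as fh:
            models[path.stem] = pickle.load(fh)
    return models


# At startup the server would call this once and keep the result in memory:
#     MODELS = load_registry("/path/to/registry")   # path is illustrative
# Nothing re-reads the directory afterwards, so a new version is only
# picked up after a restart.
```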

Why a Separate API Layer

Streamlit handles the UX but is not suitable for production-grade programmatic access. The API layer decouples model serving from presentation: data pipelines, batch jobs, and external systems can call the inference endpoint without going through a browser. This separation also made it easier to deploy the model server on a different instance from the Streamlit front end.

Stack

Python and Flask, with model artifacts serialized as pickle files (scikit-learn) or .pt files (PyTorch). Deployment is Docker-based: a single docker-compose.yml wires together the API server and a Redis cache that memoizes results for repeated identical inputs.
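One way result memoization on identical inputs could work is to hash the model version together with the feature payload and use that hash as the cache key. The sketch below is an assumption, not the repo's implementation: cache_key, cached_predict, and DictCache are hypothetical names, and DictCache is an in-memory stand-in that mimics only the get/set calls of a Redis client.

```python
import hashlib
import json


def cache_key(model_version, features):
    """Build a deterministic key from the model version and input features."""
    payload = json.dumps(
        {"v": model_version, "x": features},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "predict:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_predict(cache, model_version, features, predict_fn):
    """Return a cached result if present, else compute and store it."""
    key = cache_key(model_version, features)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = predict_fn(features)
    cache.set(key, json.dumps(result))
    return result


class DictCache:
    """In-memory stand-in exposing get/set like a Redis client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
```

Including the model version in the key means a restart with a new model does not serve predictions cached from the previous one.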
