Pedram Agand

MLAAS: Machine Learning API Backend

Tags: mlops, api, python, scikit-learn, deployment, rest-api

REST API backend for serving machine learning models as a service — wrapping scikit-learn and PyTorch models behind versioned HTTP endpoints.

MLAAS (Machine Learning as a Service) is the API backend for serving trained ML models over HTTP. Where the Streamlit deployment handles the interactive browser interface, this repo provides the underlying REST layer — model loading, versioning, and inference endpoints that other clients can call programmatically.

What It Does

The server wraps scikit-learn and PyTorch model artifacts behind versioned endpoints:

  • POST /predict — run inference on a submitted feature payload
  • GET /model/info — return model metadata (version, trained-on, input schema)
  • GET /health — used by the load balancer for readiness probes
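The three endpoints above could be wired up roughly as follows. This is a minimal sketch, not the repo's actual code: the `MODEL_INFO` fields, the `MeanModel` stand-in, the response shapes, and the `handle_predict` / `create_app` names are all assumptions made for illustration, and the model is assumed to return a numeric prediction.

```python
from typing import Any

# Hypothetical metadata; the real field names and values are not given in the write-up.
MODEL_INFO = {
    "version": "v1",
    "trained_on": "example-dataset",
    "input_schema": ["feature_a", "feature_b"],
}


class MeanModel:
    """Stand-in for a loaded model; exposes a scikit-learn-style predict()."""

    def predict(self, rows: list[list[float]]) -> list[float]:
        return [sum(row) / len(row) for row in rows]


def handle_predict(model: Any, payload: Any) -> tuple[dict, int]:
    """Validate the feature payload, run inference, return (body, HTTP status)."""
    if not isinstance(payload, dict):
        return {"error": "expected a JSON object"}, 400
    missing = [name for name in MODEL_INFO["input_schema"] if name not in payload]
    if missing:
        return {"error": f"missing features: {missing}"}, 400
    try:
        row = [float(payload[name]) for name in MODEL_INFO["input_schema"]]
    except (TypeError, ValueError):
        return {"error": "features must be numeric"}, 400
    prediction = model.predict([row])[0]
    # Assumes a numeric output; float() also converts NumPy scalars for JSON.
    return {"prediction": float(prediction), "model_version": MODEL_INFO["version"]}, 200


def create_app(model: Any):
    """Register the routes on a Flask app (requires Flask 2.0+)."""
    # Imported here so handle_predict stays usable and testable without Flask.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.post("/predict")
    def predict():
        body, status = handle_predict(model, request.get_json(silent=True))
        return jsonify(body), status

    @app.get("/model/info")
    def model_info():
        return jsonify(MODEL_INFO)

    @app.get("/health")
    def health():
        return jsonify({"status": "ok"})

    return app
```

Keeping the validation and inference logic in a plain function means it can be unit-tested without starting a server; `create_app(MeanModel()).run()` would serve it locally.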

Models are loaded at startup from a model registry directory and cached in memory. Swapping a model version requires updating the registry path and restarting — no hot-reload, which is intentional for reproducibility in regulated contexts.
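Load-once-at-startup can be sketched as below. The registry layout (one `.pkl` file per model), the `MODEL_REGISTRY_DIR` environment variable, and the function names are assumptions for illustration; the `.pt` path for PyTorch artifacts is left out to keep the sketch dependency-free.

```python
import os
import pickle
from pathlib import Path
from typing import Any


def registry_path_from_env(default: str = "models") -> str:
    """Read the registry directory from a (hypothetical) environment variable."""
    return os.environ.get("MODEL_REGISTRY_DIR", default)


def load_registry(registry_dir: str) -> dict[str, Any]:
    """Load every pickled artifact in the directory once, keyed by file stem."""
    models: dict[str, Any] = {}
    for path in sorted(Path(registry_dir).iterdir()):
        if path.suffix != ".pkl":
            continue  # ignore anything that is not a pickle artifact
        # Unpickling can execute arbitrary code, so only load trusted files.
        with path.open("rb") as fh:
            models[path.stem] = pickle.load(fh)
    return models
```

At startup the server would call `load_registry(registry_path_from_env())` once and keep the returned dict for the life of the process. Nothing re-reads the directory afterwards, which is what makes a restart the only way to change versions.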

Why a Separate API Layer

Streamlit handles the UX but is not suitable for production-grade programmatic access. The API layer decouples model serving from presentation: data pipelines, batch jobs, and external systems can call the inference endpoint without going through a browser. This separation also made it easier to deploy the model server on a different instance from the Streamlit front end.
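As an illustration of programmatic access, a batch job could call the inference endpoint with nothing but the standard library. The base URL, the feature names, and both function names here are hypothetical.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /predict endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(base_url: str, features: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

A pipeline step would then call something like `predict_remote("http://localhost:5000", {"feature_a": 1.0, "feature_b": 3.0})`, assuming the server is reachable at that address.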

Stack

The server is written in Python with Flask. Model artifacts are serialized as pickle files (scikit-learn) or .pt files (PyTorch). Deployment is Docker-based: a single docker-compose.yml wires together the API server and a Redis cache that memoizes results for repeated identical inputs.
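Memoizing repeated identical inputs might look like the sketch below. The key format, the function names, and the choice to hash a canonical JSON form of the payload are assumptions, not details from the repo. `InMemoryCache` is a dict-backed stand-in so the example runs without a Redis server; a `redis.Redis` client exposes the same `get` and `set` calls.

```python
import hashlib
import json
from typing import Any, Callable, Optional, Protocol


class KeyValueCache(Protocol):
    """The subset of a Redis client used here."""

    def get(self, key: str) -> Optional[bytes]: ...
    def set(self, key: str, value: str) -> Any: ...


class InMemoryCache:
    """Dict-backed stand-in that returns bytes, as Redis does by default."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._store.get(key)

    def set(self, key: str, value: str) -> bool:
        self._store[key] = value.encode("utf-8")
        return True


def cache_key(model_version: str, payload: dict) -> str:
    """Hash a canonical JSON form so key order does not change the cache key."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"predict:{model_version}:{digest}"


def cached_predict(
    cache: KeyValueCache,
    model_version: str,
    payload: dict,
    predict_fn: Callable[[dict], dict],
) -> dict:
    """Return the cached result for an identical input, else compute and store it."""
    key = cache_key(model_version, payload)
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = predict_fn(payload)
    cache.set(key, json.dumps(result))
    return result
```

Including the model version in the key keeps results from one version from being served after a different version is deployed against the same cache.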
