30 Days of MLOps Challenge · Day 11

Packaging Models with Docker – Containerize & Deploy Your ML Models

By Aviraj Kawade · July 7, 2025 · 6 min read

Bundle your model, code, and runtime into a portable container image to ensure consistent, reproducible deployments across dev, staging, and prod.


Key Learnings

  • Why containerization matters for reproducible and scalable deployment.
  • Docker basics for packaging ML models.
  • Writing Dockerfiles for Flask, FastAPI, and TensorFlow Serving.

[Figure: overview of containerizing ML models]

Why Containerization for ML?

  • Reproducibility: Encapsulate the exact environment; avoid “works on my machine”.
  • Portability: Run anywhere a container runtime is available.
  • Dependency Management: Bundle libraries and tools; avoid version conflicts.
  • Scalability: Works with Kubernetes for autoscaling and rollouts.
  • Security & Isolation: Minimal images, resource limits, sandboxing.

Common Tools

  • Docker, Podman
  • Kubernetes, Kubeflow
  • MLflow + Docker for lifecycle integration
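
For example, MLflow can build a serving image directly from a logged model (the run ID below is a placeholder):

mlflow models build-docker -m "runs:/<run_id>/model" -n iris-mlflow-serve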

Docker Basics

  • Image: Snapshot of the app plus its environment.
  • Container: Running instance of an image.
  • Dockerfile: Build instructions for the image.
  • Multi‑stage build: Optimize image size via separate build and runtime stages (see the sketch below).
  • ENTRYPOINT/CMD: Default process to run in the container.
  • Ports: Expose service endpoints.
  • .dockerignore: Exclude files from the build context.
  • Tags: Version images (e.g., v1.0).
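
For illustration, a minimal multi-stage Dockerfile might look like this (the build stage installs dependencies; only the installed packages and app code reach the final image):

FROM python:3.10-slim AS build
WORKDIR /app
COPY requirements.txt .
# Install dependencies into an isolated prefix we can copy out of this stage
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.10-slim
WORKDIR /app
# Copy only the installed packages; build tools and caches stay behind
COPY --from=build /install /usr/local
COPY . .
CMD ["python", "app.py"]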

Sample Project Layout

ml-model-docker/
├── Dockerfile
├── app.py                # Flask/FastAPI app to serve the model
├── model.pkl             # Trained ML model
├── requirements.txt      # Python dependencies
└── README.md
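
If app.py is a Flask server, requirements.txt might pin just the serving stack (versions here are illustrative):

flask==3.0.3
scikit-learn==1.4.2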

Sample Dockerfile

# Slim base image keeps the final image small
FROM python:3.10-slim
WORKDIR /app
# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]

Build & Run

docker build -t ml-model-api .
docker run -p 5000:5000 ml-model-api
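
Tag and push the image to share it (the registry name below is a placeholder):

docker tag ml-model-api registry.example.com/ml-model-api:v1.0
docker push registry.example.com/ml-model-api:v1.0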

Dockerizing Example Apps

Flask

# flask_model/model_build.py
import pickle
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = LogisticRegression()
model.fit(X, y)

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# flask_model/app.py
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

# Load the trained model once at startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    data = request.json["input"]        # expects {"input": [f1, f2, f3, f4]}
    prediction = model.predict([data])  # wrap in a list: predict expects a 2D array
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the server is reachable from outside the container
    app.run(host="0.0.0.0", port=5000)

# flask_model/Dockerfile
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install flask scikit-learn
CMD ["python", "app.py"]

FastAPI

# fastapi_model/model_build.py
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier()
model.fit(X, y)

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# fastapi_model/main.py
import pickle
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained model once at startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

class InputData(BaseModel):
    input: List[float]

@app.post("/predict")
def predict(data: InputData):
    prediction = model.predict([data.input])  # predict expects a 2D array
    return {"prediction": prediction.tolist()}

# fastapi_model/Dockerfile
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install fastapi uvicorn scikit-learn pydantic
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

TensorFlow Serving

# tf_serving_model/model_build.py
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

X, y = load_iris(return_X_y=True)
y = to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = Sequential([
    Dense(10, activation='relu', input_shape=(4,)),
    Dense(3, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
model.save("saved_model/iris_model")  # SavedModel format, as expected by TF Serving

# tf_serving_model/Dockerfile
FROM tensorflow/serving:2.14.0
# TF Serving expects numeric version subdirectories (here: version 1)
COPY saved_model/iris_model /models/iris_model/1
# The base image's entrypoint starts tensorflow_model_server with
# --model_name=$MODEL_NAME and --model_base_path=$MODEL_BASE_PATH/$MODEL_NAME,
# serving REST on 8501 and gRPC on 8500
ENV MODEL_NAME=iris_model
ENV MODEL_BASE_PATH=/models
EXPOSE 8501 8500

Build/Run & Test

# Flask
cd flask_model
python model_build.py
docker build -t flask-ml-model .
docker run -d -p 5000:5000 flask-ml-model   # -d: run detached so the terminal stays free
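
# Test the Flask endpoint (payload shape matches app.py's {"input": [...]})
curl -X POST http://localhost:5000/predict \
  -H "Content-Type: application/json" \
  -d '{"input": [5.1, 3.5, 1.4, 0.2]}'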

# FastAPI
cd ../fastapi_model
python model_build.py
docker build -t fastapi-ml-model .
docker run -d -p 8000:8000 fastapi-ml-model
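
# Test the FastAPI endpoint (same payload shape as the Flask app)
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"input": [5.1, 3.5, 1.4, 0.2]}'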

# TensorFlow Serving
cd ../tf_serving_model
python model_build.py
docker build -t tf-serving-iris .
docker run -d -p 8501:8501 tf-serving-iris

# Test the TF Serving REST endpoint
curl -X POST http://localhost:8501/v1/models/iris_model:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}'

Best Practices

  • Use slim/alpine base images where possible.
  • Add a .dockerignore to shrink the build context (see the sample after this list).
  • Keep images minimal; avoid unused packages.
  • Use environment variables for config and secrets.
  • Consider multi‑stage builds for smaller outputs.
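
A starting point for .dockerignore (entries are illustrative; adjust to your project):

__pycache__/
*.pyc
.git/
.venv/
data/
*.ipynb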

Challenges

  • Create a Dockerfile to containerize the app and expose an API.
  • Build and run the container; make a POST request with sample input.
  • Test from another terminal (e.g., curl POST to /predict).
  • Add .dockerignore to exclude unnecessary files.
  • Use a multi‑stage Dockerfile to optimize image size.
  • Write a README with instructions and sample requests.