30 Days of MLOps Challenge · Day 11

Packaging Models with Docker – Containerize & Deploy Your ML Models

By Aviraj Kawade · July 7, 2025 · 6 min read

Bundle your model, code, and runtime into a portable container image to ensure consistent, reproducible deployments across dev, staging, and prod.


Key Learnings

  • Why containerization matters for reproducible and scalable deployment.
  • Docker basics for packaging ML models.
  • Writing Dockerfiles for Flask, FastAPI, and TensorFlow Serving.

[Figure: overview of containerizing ML models]

Why Containerization for ML?

  • Reproducibility: Encapsulate the exact environment; avoid “works on my machine”.
  • Portability: Run anywhere a container runtime is available.
  • Dependency Management: Bundle libraries and tools; avoid version conflicts.
  • Scalability: Works with Kubernetes for autoscaling and rollouts.
  • Security & Isolation: Minimal images, resource limits, sandboxing.

Common Tools

  • Docker, Podman
  • Kubernetes, Kubeflow
  • MLflow + Docker for lifecycle integration
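
For example, MLflow can build a serving image directly from a logged model (the run ID below is a placeholder):

mlflow models build-docker -m "runs:/<run_id>/model" -n iris-mlflow-serve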

Docker Basics

  • Image: Snapshot of the app plus its environment.
  • Container: Running instance of an image.
  • Dockerfile: Build instructions for the image.
  • Multi‑stage build: Optimize image size via separate build and runtime stages (see the sketch below).
  • ENTRYPOINT/CMD: Default process to run in the container.
  • Ports: Expose service endpoints.
  • .dockerignore: Exclude files from the build context.
  • Tags: Version images (e.g., v1.0).
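
For illustration, a minimal multi-stage Dockerfile might look like this (the build stage installs dependencies; only the installed packages and app code reach the final image):

FROM python:3.10-slim AS build
WORKDIR /app
COPY requirements.txt .
# Install dependencies into an isolated prefix we can copy out of this stage
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

FROM python:3.10-slim
WORKDIR /app
# Copy only the installed packages; build tools and caches stay behind
COPY --from=build /install /usr/local
COPY . .
CMD ["python", "app.py"]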

Sample Project Layout

ml-model-docker/
├── Dockerfile
├── app.py                # Flask/FastAPI app to serve the model
├── model.pkl             # Trained ML model
├── requirements.txt      # Python dependencies
└── README.md
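
If app.py is a Flask server, requirements.txt might pin just the serving stack (versions here are illustrative):

flask==3.0.3
scikit-learn==1.4.2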

Sample Dockerfile

# Slim base image keeps the final image small
FROM python:3.10-slim
WORKDIR /app
# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]

Build & Run

docker build -t ml-model-api .
docker run -p 5000:5000 ml-model-api
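
Tag and push the image to share it (the registry name below is a placeholder):

docker tag ml-model-api registry.example.com/ml-model-api:v1.0
docker push registry.example.com/ml-model-api:v1.0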

Dockerizing Example Apps

Flask

# flask_model/model_build.py
import pickle
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = LogisticRegression()
model.fit(X, y)

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# flask_model/app.py
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)

# Load the trained model once at startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    data = request.json["input"]        # expects {"input": [f1, f2, f3, f4]}
    prediction = model.predict([data])  # wrap in a list: predict expects a 2D array
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the server is reachable from outside the container
    app.run(host="0.0.0.0", port=5000)

# flask_model/Dockerfile
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install flask scikit-learn
CMD ["python", "app.py"]

FastAPI

# fastapi_model/model_build.py
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier()
model.fit(X, y)

with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# fastapi_model/main.py
import pickle
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained model once at startup
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

class InputData(BaseModel):
    input: List[float]

@app.post("/predict")
def predict(data: InputData):
    prediction = model.predict([data.input])  # predict expects a 2D array
    return {"prediction": prediction.tolist()}

# fastapi_model/Dockerfile
FROM python:3.9
WORKDIR /app
COPY . .
RUN pip install fastapi uvicorn scikit-learn pydantic
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

TensorFlow Serving

# tf_serving_model/model_build.py
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

X, y = load_iris(return_X_y=True)
y = to_categorical(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = Sequential([
    Dense(10, activation='relu', input_shape=(4,)),
    Dense(3, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
model.save("saved_model/iris_model")  # SavedModel format, as expected by TF Serving

# tf_serving_model/Dockerfile
FROM tensorflow/serving:2.14.0
# TF Serving expects numeric version subdirectories (here: version 1)
COPY saved_model/iris_model /models/iris_model/1
# The base image's entrypoint starts tensorflow_model_server with
# --model_name=$MODEL_NAME and --model_base_path=$MODEL_BASE_PATH/$MODEL_NAME,
# serving REST on 8501 and gRPC on 8500
ENV MODEL_NAME=iris_model
ENV MODEL_BASE_PATH=/models
EXPOSE 8501 8500

Build/Run & Test

# Flask
cd flask_model
python model_build.py
docker build -t flask-ml-model .
docker run -d -p 5000:5000 flask-ml-model   # -d: run detached so the terminal stays free
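
# Test the Flask endpoint (payload shape matches app.py's {"input": [...]})
curl -X POST http://localhost:5000/predict \
  -H "Content-Type: application/json" \
  -d '{"input": [5.1, 3.5, 1.4, 0.2]}'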

# FastAPI
cd ../fastapi_model
python model_build.py
docker build -t fastapi-ml-model .
docker run -d -p 8000:8000 fastapi-ml-model
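
# Test the FastAPI endpoint (same payload shape as the Flask app)
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"input": [5.1, 3.5, 1.4, 0.2]}'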

# TensorFlow Serving
cd ../tf_serving_model
python model_build.py
docker build -t tf-serving-iris .
docker run -d -p 8501:8501 tf-serving-iris

# Test the TF Serving REST endpoint
curl -X POST http://localhost:8501/v1/models/iris_model:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}'

Best Practices

  • Use slim/alpine base images where possible.
  • Add a .dockerignore to shrink the build context (see the sample after this list).
  • Keep images minimal; avoid unused packages.
  • Use environment variables for config and secrets.
  • Consider multi‑stage builds for smaller outputs.
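
A starting point for .dockerignore (entries are illustrative; adjust to your project):

__pycache__/
*.pyc
.git/
.venv/
data/
*.ipynb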

Challenges

  • Create a Dockerfile to containerize the app and expose an API.
  • Build and run the container; make a POST request with sample input.
  • Test from another terminal (e.g., curl POST to /predict).
  • Add .dockerignore to exclude unnecessary files.
  • Use a multi‑stage Dockerfile to optimize image size.
  • Write a README with instructions and sample requests.