30 Days of MLOps Challenge · Day 4

Reproducible ML environments using Conda & Docker

By Aviraj Kawade · June 14, 2025 · 5 min read

Use Conda for package‑level reproducibility and Docker for system‑level consistency to eliminate “works on my machine” problems.


Key Learnings

  • Why environment reproducibility matters in ML.
  • Conda for managing Python environments and dependencies.
  • Docker for portable, consistent runtime environments.
  • How Conda and Docker complement each other.

Environment Reproducibility in ML

Recreate the exact same environment—software versions, dependencies, and system libraries—so models behave identically across machines and time.

  • Consistent results across train/test/prod
  • Reliable experimentation and comparisons
  • Team collaboration without setup issues
  • Fewer "works on my machine" bugs
  • Streamlined CI/CD and debugging

Common tools: Conda/Virtualenv, Docker, pip + requirements.txt, MLflow/DVC.
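
A minimal sketch of the simplest item on that list, pip + requirements.txt: freeze the exact versions you are running and reinstall them elsewhere. File and environment names here are only examples.

# Capture the exact versions currently installed
pip freeze > requirements.txt
# Recreate them on another machine, ideally in a fresh virtual environment
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt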

Conda for Managing Python Environments

Conda isolates per‑project dependencies, supports non‑Python packages and CUDA stacks, and enables easy export/import of entire environments.

Install Conda

Linux/macOS

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# macOS ARM
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
bash Miniconda3-latest-MacOSX-arm64.sh

Windows (PowerShell)

wget "https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe" -outfile ".\Downloads\Miniconda3-latest-Windows-x86_64.exe"

Core workflow

# Create env
conda create -n ml-env python=3.10 numpy pandas scikit-learn jupyter
# Activate
conda activate ml-env
# Install deps
conda install matplotlib seaborn jupyterlab
conda install -c conda-forge xgboost
# Deep learning
conda install -c pytorch pytorch torchvision torchaudio
conda install -c conda-forge tensorflow
# Jupyter kernel
pip install ipykernel
python -m ipykernel install --user --name ml-env --display-name "Python (ml-env)"
# Export / Re-create
conda env export -n ml-env > environment.yml
conda env create -f environment.yml
# Remove
conda remove -n ml-env --all

Example Project Structure

my-ml-project/
├── data/
├── notebooks/
├── src/
├── environment.yml
└── README.md

Best Practices

  • Commit environment.yml to Git.
  • Prefer conda-forge channel for ML stacks.
  • Use conda-lock or Docker for tighter reproducibility (see the sketch after this list).
  • Install pip packages last if needed.
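
For fully pinned, cross-platform installs, conda-lock can generate a lock file from environment.yml. A minimal sketch, assuming conda-lock is installed via pip and environment.yml sits in the project root:

# Install the tool and generate a lock file for the platforms you care about
pip install conda-lock
conda-lock lock -f environment.yml -p linux-64 -p osx-arm64
# Recreate the environment from the lock file on any of those platforms
conda-lock install -n ml-env conda-lock.yml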

Docker for Portable ML Environments

Why Docker?

  • Portability: package code + environment as a single image.
  • Consistency: immutable builds prevent drift.
  • Reproducibility: identical across dev/test/prod.

1) Dockerfile

# Base image with Python
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# System deps
RUN apt-get update && apt-get install -y \
    build-essential \
    git \
    wget \
    && rm -rf /var/lib/apt/lists/*

# Copy project
COPY . .

# Python deps
RUN pip install --no-cache-dir -r requirements.txt

# Default command
CMD ["python", "train.py"]

2) requirements.txt

numpy
pandas
scikit-learn
matplotlib
jupyterlab
tensorflow

3) Build & Run

# Build
docker build -t ml-env:latest .
# Interactive dev shell (override the default CMD with bash)
docker run -it --rm -v $(pwd):/app ml-env:latest bash
# Jupyter Lab
docker run -it -p 8888:8888 -v $(pwd):/app ml-env:latest jupyter lab --ip=0.0.0.0 --allow-root
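
To verify the dependencies inside the container, you can override the default command with a quick import check (module names follow requirements.txt above):

docker run --rm ml-env:latest python -c "import numpy, pandas, sklearn; print('deps OK')"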

4) .dockerignore

__pycache__/
*.pyc
.env
data/
models/

5) docker-compose (optional)

version: '3'
services:
  ml:
    build: .
    volumes:
      - .:/app
    ports:
      - "8888:8888"
  mongo:
    image: mongo:latest
    ports:
      - "27017:27017"

Pro Tips

  • Pin versions in requirements.txt
  • Use lightweight base images
  • Mount data via volumes; keep images lean
  • Use env vars for secrets/config (example below)
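
Pinning means exact versions (for example numpy==1.26.4 rather than a bare numpy), and secrets should reach the container at runtime instead of being baked into the image. A small sketch, where MODEL_DIR and API_KEY are hypothetical variable names:

# Pass individual variables on the command line
docker run --rm -e MODEL_DIR=/app/models -e API_KEY=changeme ml-env:latest
# Or load several variables from a local .env file (kept out of the image via .dockerignore)
docker run --rm --env-file .env ml-env:latest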

Conda vs Docker

Feature     | Conda                       | Docker
------------|-----------------------------|----------------------------
Scope       | Python/R envs and packages  | Full OS‑level env and deps
Speed       | Fast local setup            | Slower to build images
Isolation   | Language/package level      | System‑level isolation
Portability | OS‑dependent quirks         | Highly portable
Best use    | Notebooks & prototyping     | Deployment & CI/CD

Docker + Conda Together

Combine Conda for dependency management with Docker for system reproducibility.

Dockerfile

FROM continuumio/miniconda3

# Copy and create Conda environment
COPY environment.yml .
RUN conda env create -f environment.yml

# Run subsequent RUN instructions inside the Conda environment
SHELL ["conda", "run", "-n", "mlenv", "/bin/bash", "-c"]

# Set working directory
WORKDIR /app
COPY . .

# Exec-form CMD ignores SHELL, so invoke the environment explicitly
CMD ["conda", "run", "--no-capture-output", "-n", "mlenv", "python", "train.py"]

environment.yml

name: mlenv
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.9
  - pandas
  - numpy
  - scikit-learn
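
Build and run the combined image; the tag name ml-conda is only an example, and train.py is assumed to exist in the project:

docker build -t ml-conda:latest .
docker run --rm ml-conda:latest
# Spot-check that the Conda environment resolved inside the image
docker run --rm ml-conda:latest conda run -n mlenv python -c "import pandas, sklearn; print('ok')"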

Tips

  • Version Dockerfile and environment.yml in Git.
  • Use docker-compose for multi‑service setups.
  • Prefer mamba (bundled with Miniforge) for faster installs (sketch below).
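
A minimal sketch of the mamba route, assuming the community condaforge/miniforge3 base image:

FROM condaforge/miniforge3
COPY environment.yml .
# mamba is a faster drop-in replacement for conda's solver here
RUN mamba env create -f environment.yml && conda clean --all --yes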

Challenges

  • Create a Conda environment, install 3 ML packages, and export it as environment.yml.
  • Install a Jupyter kernel for your Conda environment and test in JupyterLab.
  • Write a Dockerfile to build a basic ML image with Pandas and Scikit‑learn.
  • Run the Docker container and verify dependencies inside.
  • Combine Conda + Docker: build an image using your exported environment.yml.
  • Document your setup in README.md so others can reproduce.