30 Days of MLOps Challenge · Day 4
Reproducible ML environments using Conda & Docker
Use Conda for package‑level reproducibility and Docker for system‑level consistency to eliminate “works on my machine” problems.
💡 Hey — It's Aviraj Kawade 👋
🧠 New to DevOps? Start 60 Days of DevOps
Key Learnings
- Why environment reproducibility matters in ML.
- Conda for managing Python environments and dependencies.
- Docker for portable, consistent runtime environments.
- How Conda and Docker complement each other.
Environment Reproducibility in ML
Recreate the exact same environment—software versions, dependencies, and system libraries—so models behave identically across machines and time.
- Consistent results across train/test/prod
- Reliable experimentation and comparisons
- Team collaboration without setup issues
- Fewer "works on my machine" bugs
- Streamlined CI/CD and debugging
Common tools: Conda/Virtualenv, Docker, pip + requirements.txt, MLflow/DVC.
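The "recreate exactly" idea can be made concrete by logging the interpreter, OS, and package versions alongside every experiment. A minimal sketch in Python (function and key names here are illustrative, not a specific library's API):

```python
# Record the environment an experiment ran in, so results can later be
# traced back to exact versions.
import platform
import sys
from importlib import metadata

def environment_snapshot(packages):
    """Collect interpreter, OS, and package versions for an experiment log."""
    snapshot = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for pkg in packages:
        try:
            snapshot[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            snapshot[pkg] = None  # flag missing dependencies explicitly
    return snapshot

if __name__ == "__main__":
    print(environment_snapshot(["numpy", "pandas", "scikit-learn"]))
```

Saving this dict next to each run's metrics makes "which versions produced this result?" answerable later.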
Conda for Managing Python Environments
Conda isolates per‑project dependencies, supports non‑Python packages and CUDA stacks, and enables easy export/import of entire environments.
Install Conda
Linux/macOS (x86)
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# macOS ARM
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
bash Miniconda3-latest-MacOSX-arm64.sh
Windows (PowerShell)
Invoke-WebRequest "https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe" -OutFile ".\Downloads\Miniconda3-latest-Windows-x86_64.exe"
Core workflow
# Create env
conda create -n ml-env python=3.10 numpy pandas scikit-learn jupyter
# Activate
conda activate ml-env
# Install deps
conda install matplotlib seaborn jupyterlab
conda install -c conda-forge xgboost
# Deep learning
conda install -c pytorch pytorch torchvision torchaudio
conda install -c conda-forge tensorflow
# Jupyter kernel
pip install ipykernel
python -m ipykernel install --user --name ml-env --display-name "Python (ml-env)"
# Export / Re-create
conda env export -n ml-env > environment.yml
conda env create -f environment.yml
# Remove
conda remove -n ml-env --all
Example Project Structure
my-ml-project/
├── data/
├── notebooks/
├── src/
├── environment.yml
└── README.md
Best Practices
- Commit environment.yml to Git.
- Prefer the conda-forge channel for ML stacks.
- Use conda-lock or Docker for tighter reproducibility.
- Install pip packages last if needed.
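Putting those practices together, a committed environment.yml might look like this (versions and the pip-only package name are illustrative placeholders):

```yaml
name: ml-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - numpy=1.26
  - pandas=2.1
  - scikit-learn=1.3
  - pip
  - pip:
      - some-pip-only-package==1.0  # pip deps go last, under the pip key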
Docker for Portable ML Environments
Why Docker?
- Portability: package code + environment as a single image.
- Consistency: immutable builds prevent drift.
- Reproducibility: identical across dev/test/prod.
1) Dockerfile
# Base image with Python
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# System deps
RUN apt-get update && apt-get install -y \
build-essential \
git \
wget \
&& rm -rf /var/lib/apt/lists/*
# Copy project
COPY . .
# Python deps
RUN pip install --no-cache-dir -r requirements.txt
# Default command
CMD ["python", "train.py"]
2) requirements.txt
numpy
pandas
scikit-learn
matplotlib
jupyterlab
tensorflow
3) Build & Run
# Build
docker build -t ml-env:latest .
# Interactive dev
docker run -it --rm -v $(pwd):/app ml-env:latest
# Jupyter Lab
docker run -it -p 8888:8888 -v $(pwd):/app ml-env:latest jupyter lab --ip=0.0.0.0 --allow-root
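To check that the image really contains the expected stack, a small script can be copied into the image and run in the container, e.g. `docker run --rm ml-env:latest python check_env.py` (check_env.py is a hypothetical helper, a minimal sketch):

```python
# check_env.py -- report which expected packages are present in this environment.
from importlib import metadata

EXPECTED = ["numpy", "pandas", "scikit-learn"]

def installed_version(pkg):
    """Return the installed version of pkg, or None if it is not installed."""
    try:
        return metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return None

if __name__ == "__main__":
    for pkg in EXPECTED:
        print(f"{pkg}: {installed_version(pkg) or 'MISSING'}")
```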
4) .dockerignore
__pycache__/
*.pyc
.env
data/
models/
5) docker-compose (optional)
version: '3'
services:
  ml:
    build: .
    volumes:
      - .:/app
    ports:
      - "8888:8888"
  mongo:
    image: mongo:latest
    ports:
      - "27017:27017"
Pro Tips
- Pin versions in requirements.txt
- Use lightweight base images
- Mount data via volumes; keep images lean
- Use env vars for secrets/config
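The last tip can be sketched in code: read configuration from environment variables at runtime, with safe defaults for non-secrets and no default for secrets (the variable names here are illustrative):

```python
# Load runtime configuration from environment variables instead of
# hard-coding values (or secrets) into the image.
import os

def load_config():
    return {
        # non-secret settings get sensible defaults
        "data_dir": os.environ.get("ML_DATA_DIR", "/app/data"),
        "epochs": int(os.environ.get("ML_EPOCHS", "10")),
        # secrets get no default: inject them, e.g. docker run -e ML_API_KEY=...
        "api_key": os.environ.get("ML_API_KEY"),
    }

if __name__ == "__main__":
    print(load_config())
```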
Conda vs Docker
| Feature | Conda | Docker |
| --- | --- | --- |
| Scope | Python/R envs and packages | Full OS‑level env and deps |
| Speed | Fast local setup | Slower to build images |
| Isolation | Language/package level | System‑level isolation |
| Portability | OS‑dependent quirks | Highly portable |
| Best use | Notebooks & prototyping | Deployment & CI/CD |
Docker + Conda Together
Combine Conda for dependency management with Docker for system reproducibility.
Dockerfile
FROM continuumio/miniconda3
# Copy and create Conda environment
COPY environment.yml .
RUN conda env create -f environment.yml
# Run subsequent RUN instructions inside the environment
SHELL ["conda", "run", "-n", "mlenv", "/bin/bash", "-c"]
# Set working directory
WORKDIR /app
COPY . .
# Exec-form CMD bypasses SHELL, so activate the env explicitly
CMD ["conda", "run", "-n", "mlenv", "python", "train.py"]
environment.yml
name: mlenv
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.9
  - pandas
  - numpy
  - scikit-learn
Tips
- Version Dockerfile and environment.yml in Git.
- Use docker-compose for multi‑service setups.
- Prefer mamba (bundled with Miniforge) for faster dependency solves.
Challenges
- Create a Conda environment, install 3 ML packages, and export it as environment.yml.
- Install a Jupyter kernel for your Conda environment and test it in JupyterLab.
- Write a Dockerfile to build a basic ML image with Pandas and Scikit‑learn.
- Run the Docker container and verify dependencies inside.
- Combine Conda + Docker: build an image using your exported environment.yml.
- Document your setup in README.md so others can reproduce it.
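For the last challenge, a starter README.md for this setup might look like this (project and image names are illustrative):

```markdown
# my-ml-project

## Setup with Conda
conda env create -f environment.yml
conda activate ml-env

## Setup with Docker
docker build -t my-ml-project .
docker run -it --rm -v $(pwd):/app my-ml-project

## Train
python train.py
```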