用Docker搭建机器学习环境

告别手动编译玄学Cuda和Cudnn

Posted by tianhaoo on February 20, 2019 本文阅读量

先决条件

docker

安装docker的官网教程

https://docs.docker.com/install

有需要的话可以再装一个docker-compose

https://docs.docker.com/compose/install/#install-compose

nvidia-docker

github上安装nvidia-docker的教程

https://github.com/NVIDIA/nvidia-docker

tensorflow

tensorflow官网对docker的支持已经非常完善,只要照着官网进行操作即可

https://www.tensorflow.org/install/docker

实际运行时的命令如下

docker run --runtime=nvidia -it --rm -p 8888:8888 -v /home/tian/docker/docker-volumes/tensorflow-projects:/notebooks/projects/ --name tensorflow --net=host --env HTTP_PROXY="http://127.0.0.1:8118" --env HTTPS_PROXY="https://127.0.0.1:8118" tensorflow/tensorflow:latest-gpu-py3

其中指定了映射的目录卷,容器名字,以及本地ss+privoxy搭建的代理等信息。

这个docker镜像是tensorflow官方提供的,集成了常用的py3的包、jupyter等,使用非常方便。

pytorch

pytorch也是常用的类似tensorflow的工具,由于其使用简单、方便debug等特点而颇受青睐,在其官网的安装教程中使用cuda、conda之类的安装步骤比较详尽,但对于docker构建镜像却只字未提。所以想用docker搭建一个像tensorflow官方提供的镜像那样方便好用的镜像还是比较费劲的,我根据网上搜到的极为前辈的经验,整合了三个不同的pytorch镜像放在了dockerhub上,下面是具体的dockerfile,读者可根据自己需要修改和编译。

https://cloud.docker.com/repository/docker/tianhaoo/pytorch-jupyter-gpu

  1. pytorch-gpu
FROM nvidia/cuda:9.0-base-ubuntu16.04

# Install some basic utilities
RUN apt-get update && apt-get install -y \
    curl \
    ca-certificates \
    sudo \
    git \
    bzip2 \
    libx11-6 \
 && rm -rf /var/lib/apt/lists/*

# Create a working directory
RUN mkdir /app
WORKDIR /app

# Create a non-root user and switch to it
RUN adduser --disabled-password --gecos '' --shell /bin/bash user \
 && chown -R user:user /app
RUN echo "user ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/90-user
USER user

# All users can use /home/user as their home directory
ENV HOME=/home/user
RUN chmod 777 /home/user

# Install Miniconda
RUN curl -so ~/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-4.5.1-Linux-x86_64.sh \
 && chmod +x ~/miniconda.sh \
 && ~/miniconda.sh -b -p ~/miniconda \
 && rm ~/miniconda.sh
ENV PATH=/home/user/miniconda/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

# Create a Python 3.6 environment
RUN /home/user/miniconda/bin/conda install conda-build \
 && /home/user/miniconda/bin/conda create -y --name py36 python=3.6.5 \
 && /home/user/miniconda/bin/conda clean -ya
ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/home/user/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH

# CUDA 9.0-specific steps
RUN conda install -y -c pytorch \
    cuda90=1.0 \
    magma-cuda90=2.3.0 \
    "pytorch=0.4.1=py36_cuda9.0.176_cudnn7.1.2_1" \
    torchvision=0.2.1 \
 && conda clean -ya

# Install HDF5 Python bindings
RUN conda install -y h5py=2.8.0 \
 && conda clean -ya
RUN pip install h5py-cache==1.0

# Install Torchnet, a high-level framework for PyTorch
RUN pip install torchnet==0.0.4

# Install Requests, a Python library for making HTTP requests
RUN conda install -y requests=2.19.1 \
 && conda clean -ya

# Install Graphviz
RUN conda install -y graphviz=2.38.0 \
 && conda clean -ya
RUN pip install graphviz==0.8.4

# Install OpenCV3 Python bindings
RUN sudo apt-get update && sudo apt-get install -y --no-install-recommends \
    libgtk2.0-0 \
    libcanberra-gtk-module \
 && sudo rm -rf /var/lib/apt/lists/*
RUN conda install -y -c menpo opencv3=3.1.0 \
 && conda clean -ya

# Set the default command to python3
CMD ["python3"]
  1. pytorch-jupyter-cpu
FROM jupyter/datascience-notebook:latest

RUN conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/ \
	&& conda config --set show_channel_urls yes \
	&& conda install pytorch torchvision -c pytorch \
	&& conda install pytorch torchvision -c pytorch \
	&& conda install pytorch torchvision -c pytorch \
	&& conda install pytorch torchvision -c pytorch \
    && conda clean -ya

  1. pytorch-jupyter-gpu
# from https://github.com/anibali/docker-pytorch cuda-9.0
FROM tianhaoo/pytorch-gpu:1.0 

RUN conda install -y jupyter \
    && conda install -y jupyter \
    && conda install -y jupyter \
    && conda clean -ya

CMD ["python3"]