When it comes to deep learning model development and training, personally for me, the majority of the time is spent on data pre-processing, then for setting up the development environment. Cloud based development environments such as Azure DLVM, Google CoLab etc. are very good options to go with when you don’t have much time to spend on installing all the required packages for your workstation. But, there are times that we want to do the development on our machines and train/deploy in another place (may be on the client’s environment, for a machine with a better GPU for faster training or to train on a Kubernetes cluster). Docker comes handy in these scenarios.
Docker provides both hardware and software encapsulation by allowing portable deployment. If you are a data scientist/ machine learning guy or a deep learning developer, I strongly recommend you to give it a try with docker and I’m pretty sure that’ll make your life so easy.
Alright! That’s about docker! Let’s assume now you are using docker for deploying your deep learning applications and you want to use docker to ship your deep learning model to a remote computer that is having a powerful GPU, which allows you to use large mini-batch sizes and speedup your training process. Though docker containers solve the problem of framework dependencies and platform dependencies it is also hardware-agnostic. This creates a problem!
Have you ever tried to access the GPU resource on the host computer from a program running inside a docker container? Sadly, Docker does not natively support NVIDIA GPUs within containers.
The early work around was installing the Nvidia drivers inside the docker container. It’s bit of a hassle as the driver version installed in the container should match the driver on the host.
For making docker images that uses GPU resources more portable, Nvidia has introduced nvidia-docker!
Nvidia-docker is a wrapper around the docker command that mounts the GPU on the host machine with the docker container. The only thing you should pay your attention is the CUDA version you want to use.
So, in which scenarios you can use this? In my case, nvidia-docker comes handy for me when I’m running my experiment on a cluster which is having a higher GPU power. What I do is just containerize all my code into a docker and run on the remote with nvidia-docker. (Windows guys… nvidia-docker is not still available for windows hosts. Not sure if that is in the development timeline or not 😀 )
Here’s the official GitHub on nvidia-docker. Just install it at make sure to restart your docker engine and make sure nvidia-docker the default docker run-time. Then rest is the same as building and running a typical docker.
Here’s a simple docker file I wrote for containerizing my PyTorch code. I’ve used CUDA 9.1. You can modify this for your need.
Here’s the bash command used for running the docker using the Nvidia run-time.
Just try it and see how your deep learning life becomes easy! Happy coding! 🙂