A practical introduction to Docker for web developers
During my time on a team using Docker I've had to on-board a number of engineers who were completely unfamiliar with containerisation. While there are lots of guides to the myriad individual components you'll need to work with Docker, I find developers struggle to understand the relationships and the boundaries between the pieces.
This article will give a short overview of the key parts of the Docker ecosystem, how they fit together and, crucially, why you need each bit. This is intended to help you get a map of the terrain, and allow you to join a Docker team and be productive quickly. However, working with Docker is complicated, and, in order to get the best out of it, I highly encourage you to dig in and learn more about each of the components as you continue your journey.
What is Docker?
Docker is a containerisation technology which allows you to build isolated, reproducible application environments which you can use to develop applications, then push those same environments into production. Containers work similarly to virtual machines, with the key difference being that virtual machines emulate physical hardware, whereas Docker only provides an abstraction over user-space - the result being that Docker containers have a smaller performance overhead than full VM virtualisation (YMMV on non-Linux hosts).
https://docs.docker.com/engine/docker-overview/
Images, containers and volumes
The key unit of Docker is the image. Images are immutable file systems packaged up alongside some run-time configuration, and are built by running a `Dockerfile`. `Dockerfile`s contain a mixture of `RUN` shell commands, which build up the file system in layers by snapshotting the state after each command, and Docker commands which configure networking, environment variables, default command entry points and some other bits.
https://docs.docker.com/engine/reference/builder/
Public images
It's fairly uncommon for you to write a `Dockerfile` completely from scratch - it's more likely that you'll want to use an existing image as a template. For example, if you have a Python web application, you'd probably start from the public, official `python` base image:
https://hub.docker.com/_/python
The Python base images provide you with an OS with a given version of Python installed. You can then extend this image by including any additional libraries you depend on, adding your application source code and then configuring your app's entry point command.
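As a sketch of what that looks like, here is a minimal Dockerfile for a hypothetical Flask-style app (the `requirements.txt` and `app.py` names are illustrative, not from the original article):

```dockerfile
# Start from the official Python base image
FROM python:3.7

# Install dependencies first, so Docker can cache this layer
# unless requirements.txt changes
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

# Add the application source code on top
COPY . .

# Configure the container's default entry point command
CMD ["python", "app.py"]
```

Each `RUN`/`COPY` step becomes a layer, which is why dependency installation is usually placed before the source code copy.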
Having a base starting point isn't the only reason why public images are useful; they also act as an easy way to install and run software. For example, rather than go through the process of installing and managing Postgres on your development machine, and then installing and managing the same version in production, you can just use the PostgreSQL Docker image for your chosen version.
There are plenty of public images; the most common source is Docker Hub:
https://hub.docker.com/search?q=&type=image
Run time configuration
Once an image is running, it becomes a container - in the same way as, in OO terms, an instantiated class is an instance. When you run a Docker image you can add to or modify much of the image's configuration by providing arguments to the `docker run` command. For example, if you're using the PostgreSQL image, you might want to override the port that it runs on, or the ENV variable which the image uses to set the DB password.
https://docs.docker.com/engine/reference/run/
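As a concrete example (the password value is illustrative; `-p`, `-e` and `POSTGRES_PASSWORD` are standard `docker run` options and the standard variable the official image reads):

```shell
# Run the official postgres image, mapping container port 5432
# to port 5433 on the host, and setting the ENV variable the
# image uses for the DB password
docker run -p 5433:5432 -e POSTGRES_PASSWORD=secret postgres:11
```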
Storing container data with bind-mounts and volumes
As mentioned above, an image is a file system built from immutable (i.e. read-only) layers. However, running containers often need somewhere to write data.
Bind-mounts
One way to have a writeable filesystem inside a container is to mount a directory from your host machine inside the container at a given mount point, in much the same way as you might mount a network-attached device inside a Unix filesystem. This is called a bind-mount in Docker parlance.
One common use-case for bind mounts is in development to mount the application source code that you're working on. This allows you to edit your application code on your host machine and immediately have the changes synced into the container without requiring you to rebuild the image.
https://docs.docker.com/storage/bind-mounts/
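A sketch of that development workflow (the `/app` mount point and `app.py` name are illustrative):

```shell
# Bind-mount the current directory into the container at /app,
# so edits made on the host are immediately visible inside it
docker run -v "$(pwd)":/app python:3.7 python /app/app.py
```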
Volumes
Sometimes you want a persistent data volume for your container, but you don't really care where that data lives on the host. For example, if you have a running PostgreSQL container, you want Postgres to be able to write its database data somewhere, and you probably want that data to persist across multiple container runs. You could achieve this with a bind-mount by setting aside a directory on your host and storing the data there. However, you don't really care about having the content visible on your host machine, and a bind-mount means managing that directory yourself.
For this use-case, Docker has volumes. Volumes are persistent directories mounted in much the same way as bind-mounts, but Docker takes care of the creation, location, and cleanup of the directory on the host for you.
https://docs.docker.com/storage/volumes/
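For example (the volume name `pgdata` is illustrative; `/var/lib/postgresql/data` is where the official postgres image writes its database files):

```shell
# Create a named volume - Docker decides where it lives on the host
docker volume create pgdata

# Mount the volume at the path where Postgres writes its data;
# the data survives even after this container is removed
docker run -v pgdata:/var/lib/postgresql/data postgres:11
```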
Docker vs. docker-compose
In the course of developing an application, it is likely that you'll be running and coordinating multiple containers at the same time. For example, in a typical Python web application, you probably want at least the following running:
- Your custom-built web application container, likely a custom image based on a public image like `python:3.7`
- A database container, e.g. PostgreSQL
- A cache storage provider, e.g. `redis`
Furthermore, as you've now seen, there are lots of options you can provide to run a Docker container. Managing all of these by hand, and sharing that knowledge across your development team, is pretty obviously unsustainable. That's where `docker-compose` comes in. `docker-compose` allows you to write a YAML manifest file (by default called `docker-compose.yml`) which describes each of the "services" (i.e. image + running config) your application consists of. You then interact with `docker-compose`, which reads the manifest and runs the docker commands on your behalf.
For example, you could write a manifest file which describes our Python + Postgres + Redis application, then simply run `docker-compose up` in the same directory as your manifest to fetch, build and run all the containers required for the application.
https://docs.docker.com/compose/
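A sketch of such a manifest for the Python + Postgres + Redis example (the service names, ports, build context and password are illustrative assumptions, not from the original article):

```yaml
version: "3.7"
services:
  web:
    build: .             # build the custom image from ./Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - .:/app           # bind-mount the source code for development
    depends_on:
      - db
      - cache
  db:
    image: postgres:11
    environment:
      POSTGRES_PASSWORD: secret
    volumes:
      - pgdata:/var/lib/postgresql/data   # named volume for DB data
  cache:
    image: redis:5
volumes:
  pgdata:
```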
Common commands
That about covers the main concepts you'll need to know coming onto a team using Docker for development. Here are some common commands:
# Build or pull all images required for services in
# ./docker-compose.yml
# https://docs.docker.com/compose/reference/build/
docker-compose build
# Run a specific service in ./docker-compose.yml
# https://docs.docker.com/compose/reference/run/
docker-compose run <service-name>
# Run the given shell command inside the specified service
# (rather than the image's default run command)
docker-compose run <service-name> <shell command>
# bring up everything in ./docker-compose.yml
# (builds images if needed, then creates and starts all services)
# https://docs.docker.com/compose/reference/up/
docker-compose up
# check running status of services in ./docker-compose.yml
docker-compose ps
# stop all services in ./docker-compose.yml
docker-compose stop
# stop all services in ./docker-compose.yml,
# then delete the associated containers and networks
# (pass -v to also remove volumes).
# This is the nuclear option - `stop` is more commonly used
docker-compose down
# same as `run`, but connects to a running container,
# rather than spawning a new one
docker-compose exec <service-name> <shell command>
# Unused images, containers and networks accumulate in the
# course of development and will likely fill up your disk.
# If you're seeing 'no space left on device' errors,
# use this to clean up
# https://docs.docker.com/engine/reference/commandline/system_prune/
docker system prune