Monday, January 11, 2016

Docker Filesystems: Understanding the btrfs Backend

Docker's use of filesystems is built on the storage backend abstraction.[14] A storage backend stores a set of layers, each addressed by a unique name.

Docker supports several storage backends:[1]
  • vfs backend
  • devicemapper backend[21]
  • btrfs backend
  • aufs backend
In this article, we will discuss Docker filesystems in general and the btrfs backend in particular.

Images and Containers


A core part of the Docker model is the efficient use of layered images and containers:
  • Images
    • Each Docker image on the system is stored as a layer, with the parent being the layer of the parent image. 
      • To create such an image, a new layer is created (based on the right parent), and the changes in that image are then applied to the newly mounted filesystem.
    • Docker images have intermediate layers that increase reusability, decrease disk usage, and speed up docker build by allowing each step to be cached. These intermediate layers are not shown by default in the "docker images" command.
      • Each layer is a filesystem tree that can be mounted[2] when needed and modified. New layers can be started from scratch, but they can also be created with a specified parent.
  • Containers
    • Docker containers are isolated mini Linux environments built from Docker images: a base image with zero or more filesystem layers on top of it.
As shown below, this Docker installation has 1 container and 118 images, and btrfs is the configured storage driver,[13] which is the focus of this article:

# docker info
Containers: 1
Images: 118
Storage Driver: btrfs

To retrieve low-level information on a container or image, use the "docker inspect" command, which takes a required ID argument (either a container's or an image's).  You can use "docker ps" to find the ID of a specific container, or "docker images" to list the IDs of all images.

Base Image


Base images are typically minimal operating system images and the layers on top of them are added by developers to create convenience images (such as an image which already has Java SE installed and configured) for direct use or for use as building blocks.

Each container is related to a top image which is built up from layers of images starting from a base image.

To find the top image associated with a container, type:

# docker inspect --format "{{ .Image }}" ce483e532466
eca6affff525415c7e2199f1e8b2222ffce31d4bcf4a0cd05a48807d2c1f7647


To find the layers of images that a container is built up from, type:

# docker history eca6affff525415c7e2199f1e8b2222ffce31d4bcf4a0cd05a48807d2c1f7647
IMAGE               CREATED             CREATED BY                                      SIZE
eca6affff525        4 days ago          /bin/sh -c #(nop) WORKDIR /u01/app              0 B
ccf8bd04df89        4 days ago          /bin/sh -c #(nop) ENV APP_HOME=/u01/app/        0 B
c91b83e8c828        4 days ago          /bin/sh -c #(nop) USER [apaas]                  0 B
ab06ea65ece3        4 days ago          /bin/sh -c chown -R apaas:apaas /u01/           9.146 MB
2354b0ad9541        4 days ago          /bin/sh -c #(nop) ADD dir:ff4334d8629caee02b1   9.144 MB
246fb66aa39e        4 days ago          /bin/sh -c chmod -R +x /u01/scripts/            1.383 kB
21c5ddd9b74c        4 days ago          /bin/sh -c #(nop) COPY dir:17f42381efa361f6c6   1.383 kB
c347b96af5be        4 days ago          /bin/sh -c mkdir -p /u01/scripts /u01/logs      0 B
00c1fc450430        4 days ago          /bin/sh -c #(nop) USER [root]                   0 B
f52b843cf97e        7 weeks ago         /bin/sh -c mv java java.orig && chmod +x ./ja   7.718 kB
7c6d6279239c        7 weeks ago         /bin/sh -c #(nop) USER [apaas]                  0 B
5c6ad3a0ad33        7 weeks ago         /bin/sh -c mkdir -p /u01/logs && chown -R apa   306.5 MB
0100a4922bfb        7 weeks ago         /bin/sh -c #(nop) WORKDIR /u01/jdk/jdk1.7.0_9   0 B
4983e8502db6        7 weeks ago         /bin/sh -c #(nop) ADD file:3511bd6019a189ef28   226 B
266e209d77d3        7 weeks ago         /bin/sh -c #(nop) ENV PATH=/u01/jdk/jdk1.7.0_   0 B
c00eef371809        7 weeks ago         /bin/sh -c #(nop) ENV JAVA_HOME=/u01/jdk/jdk1   0 B
db5d61324db8        7 weeks ago         /bin/sh -c #(nop) ADD file:babe1a2cf183ba22e4   306.5 MB
10287b34527b        5 months ago        /bin/sh -c groupadd apaas && useradd -g apaas   296.1 kB
035a8c863461        5 months ago        /bin/sh -c mkdir -p /u01/jdk/ && mkdir -p /u0   0 B
a555d44630e2        10 months ago       /bin/sh -c #(nop) CMD [/bin/bash]               0 B
23a9eb33093d        10 months ago       /bin/sh -c #(nop) ADD file:33b9447cdbd58ef81b   195.1 MB
7258693d533e        10 months ago       /bin/sh -c #(nop) MAINTAINER Oracle Linux Pro   0 B
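
Output in this format can also be processed mechanically. The following is a minimal sketch, not part of Docker itself: it assumes "docker history" prints one header line followed by one row per layer, ordered from the top image down to the base image, as in the listing above. The function names base_image_id and layer_count are made up for illustration.

```shell
# base_image_id: read `docker history` output on stdin and print the
# first column of the last non-empty row, i.e. the bottom-most layer.
base_image_id() {
    awk 'NR > 1 && NF > 0 { id = $1 } END { if (id != "") print id }'
}

# layer_count: read `docker history` output on stdin and print the
# number of layer rows (every non-empty line except the header).
layer_count() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}
```

Piping the command above into base_image_id (docker history eca6affff525 | base_image_id) would print the ID in the last row of the listing, and piping it into layer_count would print the number of rows below the header.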


Note that "eca6affff525" is the top image, which is built on top of "ccf8bd04df89", and so on.  The base image is "7258693d533e", which has no parent.  For example, displaying the parent of the base image prints nothing (i.e., no parent):

# docker inspect --format "{{ .Parent }}" 7258693d533e
<blank>

The btrfs Backend


The btrfs backend requires /var/lib/docker to be on a btrfs filesystem and uses filesystem-level snapshotting to implement layers.

# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvdb4       11G  7.1G  2.5G  75% /var/lib/docker
# mount -l
/dev/xvdb4 on /var/lib/docker type btrfs (rw)
<snipped>

You can find the layers of the images in the folder /var/lib/docker/btrfs/subvolumes.  Each layer is stored as a btrfs subvolume inside that folder and starts out as a snapshot of the parent subvolume (if any).
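
Both points can be checked from the command line. The sketch below is a pair of small helpers that parse command output on stdin rather than calling the tools themselves. The names fstype_of and layer_subvolumes are made up for illustration, and the second helper assumes that "btrfs subvolume list" prints one line per subvolume with the subvolume path as the last field.

```shell
# fstype_of MOUNTPOINT
# Read `mount` output on stdin ("DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)")
# and print the filesystem type of MOUNTPOINT, or nothing if it is not listed.
fstype_of() {
    awk -v mp="$1" '$2 == "on" && $3 == mp && $4 == "type" { print $5; exit }'
}

# layer_subvolumes
# Read `btrfs subvolume list` output on stdin and print the name of every
# subvolume stored under a btrfs/subvolumes/ directory, one per line.
# Assumption: each input line ends with the subvolume path as its last field.
layer_subvolumes() {
    awk '{
        path = $NF
        if (match(path, /(^|\/)btrfs\/subvolumes\//))
            print substr(path, RSTART + RLENGTH)
    }'
}
```

On the system shown above, "mount -l | fstype_of /var/lib/docker" would print btrfs. Running "btrfs subvolume list /var/lib/docker | layer_subvolumes" as root should then list one name per stored layer, under the stated assumption about the output format.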

This backend is fast, because a btrfs snapshot is copy-on-write and does not copy the parent's data up front. Mounting /var/lib/docker on a different filesystem from the rest of your system is recommended in order to limit the impact of filesystem corruption.

Image Cleanup


One purpose of learning about Docker filesystems and storage backends is to ensure that you know what you are doing before cleaning up unwanted images.[15,16]

For example, an image cannot be removed while any container (running or stopped) is using it. Once you have confirmed that none is, these commands clean up untagged images (see also filtering) or all images.

# batch cleanup untagged images 
docker rmi $(docker images -q -f "dangling=true")

# remove all images by id 
docker rmi $(docker images -aq)
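
One caveat with the commands above: when the inner "docker images" call returns no IDs, "docker rmi" is invoked with no arguments and exits with an error. A minimal sketch of a guard follows. The function name rmi_if_any and the DOCKER variable are made up for illustration; DOCKER exists only so the function can be dry-run with DOCKER=echo.

```shell
# rmi_if_any: read image IDs from stdin (one per line) and pass them to
# "docker rmi" in a single call; do nothing when the list is empty.
# DOCKER is a hypothetical override (default: docker) for dry runs.
rmi_if_any() {
    ids=$(cat)
    if [ -z "$ids" ]; then
        echo "no images to remove" >&2
        return 0
    fi
    # Word splitting on $ids is intentional: one argument per image ID.
    # shellcheck disable=SC2086
    "${DOCKER:-docker}" rmi $ids
}
```

For example, "docker images -q -f dangling=true | DOCKER=echo rmi_if_any" only prints the command that would be run, and dropping DOCKER=echo performs the removal.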

References

  1. Supported Filesystems (Docker)
  2. Concept of Mounting
    • The concept of mounting allows programs to be agnostic about where your data is structured
    • From an application (or user) point of view, the file system is one tree. Under the hood, the file system structure can be on a single partition, but also on a dozen partitions, network storage, removable media and more.
  3. Displaying Physical Volumes (Red Hat)
  4. Docker - How to analyze a container's disk usage? (good)
  5. Finding all storage devices attached to a Linux machine
  6. /dev/dm-1 (block device)
    • /dev/dm-1 stands for "device mapper no. 1". Basically, it is a logical unit carved out using the kernel's embedded device mapper layer. From a userspace application's point of view, it is a raw block device.
  7. Linux file system
  8. Docker images command
  9. Docker cp command
    • You can copy to or from either a running or stopped container.
    • Behavior is similar to the common Unix utility cp -a in that 
      • directories are copied recursively with permissions preserved if possible. 
      • Ownership is set to the user and primary group on the receiving end of the transfer.  For example, 
        • Files copied to a container will be created with UID:GID of the root user. 
        • Files copied to the local machine will be created with the UID:GID of the user which invoked the docker cp command.
      • It is not possible to copy certain system files such as resources under /proc, /sys, /dev, and mounts created by the user in the container.
  10. Understanding Volumes in Docker (good) 
  11. Docker Volume Manager
  12. Docker Quicksheet 
  13. Storage Driver (Docker)
    • A storage driver is how docker implements a particular union file system. 
    • In keeping with its “batteries included, but replaceable” philosophy, Docker supports a number of different union file systems. 
      • For instance, Ubuntu’s default storage driver is AUFS, whereas for Red Hat and CentOS it is Device Mapper.
  14. Docker Images
    • Docker images are stored as series of read-only layers. 
    • When we start a container, Docker takes the read-only image and adds a read-write layer on top. 
    • If the running container modifies an existing file, the file is copied out of the underlying read-only layer and into the top-most read-write layer where the changes are applied. 
      • The version in the read-write layer hides the underlying file, but does not destroy it — it still exists in the underlying image. 
    • When a Docker container is deleted, relaunching the image will start a fresh container without any of the changes made in the previously running container — those changes are lost. 
    • Docker calls this combination of read-only layers with a read-write layer on top a Union File System.
  15. Why is docker image eating up my disk space that is not used by docker
  16. Docker error : no space left on device
  17. docker ps -s
    • -s, --size=false Display total file sizes
  18. Advanced Docker Volumes
  19. Resizing Docker containers with the Device Mapper plugin
  20. Question on Resource Limits? (Docker)
  21. devicemapper - a storage backend based on Device Mapper
  22. Docker: Btrfs Storage in Practice (Xml and More)