Ever Wondered How Docker Build Works Behind the Scenes?
Docker Storage Driver: OverlayFS
Overlay filesystems (OverlayFS on Linux) are a kind of union filesystem: they let you build a layered view of files and directories out of several separate directories. This technique is widely used with containers. In this post, I’ll provide a short introduction to where OverlayFS is used, and then I’ll show you how you can use it on the command line.
Container images vary widely in size: some are quite small, such as Alpine Linux at about 2.5MB, while others are larger, such as Ubuntu 16.04 at around 27MB or the Anaconda Python distribution at 800MB to 1.5GB.
When you start a container from an image, it begins with a clean slate, as if it had made a copy of the image exclusively for that container’s use. However, for larger container images like the 800MB Anaconda distribution, copying the entire image would be inefficient in terms of both disk space and speed. Therefore, Docker doesn’t make direct copies; instead, it uses an overlay.
How Overlays work
In short, overlay filesystems let you mount a filesystem using 2 directories: a lower directory and an upper directory. A third, empty work directory is also needed; overlayfs uses it internally.
mount -t overlay overlay -o lowerdir=/lower,upperdir=/upper,workdir=/work /merged
- the lower directory of the filesystem is read-only
- the upper directory of the filesystem can be both read from and written to
- the merged (overlay) directory is both lower and upper combined together
The lower and upper directories can be any two ordinary folders with example files in them. In practice, lower could hold a base file system, and upper could hold the changes that we’re merging on top of it.
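To try the mount command above by hand, something like the following sketch could serve as a starting point. It assumes a Linux shell; the scratch location from mktemp and the file names base.txt and shared.txt are made up for illustration. Because mounting needs root, the script only prints the mount command instead of running it.

```shell
#!/bin/sh
# Hypothetical scratch layout; every path below lives under a temp directory.
demo=$(mktemp -d)
mkdir "$demo/lower" "$demo/upper" "$demo/work" "$demo/merged"

# base.txt exists only in lower; shared.txt exists in both layers.
echo "from lower" > "$demo/lower/base.txt"
echo "from lower" > "$demo/lower/shared.txt"
echo "from upper" > "$demo/upper/shared.txt"

# Same option string as the mount command above, pointed at the scratch dirs.
opts="lowerdir=$demo/lower,upperdir=$demo/upper,workdir=$demo/work"

# Mounting requires root, so print the command rather than running it.
echo "sudo mount -t overlay overlay -o $opts $demo/merged"
```

After running the printed command, the merged directory should list both base.txt and shared.txt, with shared.txt coming from upper, since the upper directory takes precedence.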
Below are some notes on the changes one might encounter using this setup:
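- When a process reads a file in the merged directory, the overlayfs filesystem driver looks in the upper directory and reads the file from there if it’s present. Otherwise, it looks in the lower directory.
- When a process writes a file in the merged directory, overlayfs stores the change in the upper directory, and the result is visible through merged. If the file existed only in lower, overlayfs first copies it up to upper; the lower directory is left untouched.
- When a process writes a file directly in the upper directory, the file typically shows up in both the upper and merged directories.
- When a process writes a file directly in the lower directory, the file typically shows up in both the lower and merged directories. The lower directory is read-only only from the overlay’s point of view; nothing stops a process from writing to it directly.
- When a process removes a file from the merged directory, the file disappears from merged only. If the file came from lower, overlayfs leaves it there and instead creates a character device (with device number 0/0) of the same name in the upper directory, which is how the overlayfs driver records that a file has been deleted. This marker is called a whiteout.
- When a process removes a file directly from the lower directory, the file is deleted from lower, even though overlayfs itself treats lower as read-only.
- When a process removes a file directly from the upper directory, the file is deleted from upper and typically disappears from merged as well.
Note that changing upper or lower directly while the overlay is mounted is not supported: the kernel documentation describes the resulting behavior as undefined. Treat the "directly" cases above as observations rather than guarantees, and make changes through merged.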
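The read and write rules above can be modeled in a few lines of shell, without mounting anything. This is only a sketch of the lookup logic, not how the kernel driver is implemented: overlay_read and overlay_write are hypothetical helper names, and the whiteout check looks only at the file type (character device), not at the device number.

```shell
#!/bin/sh
# overlay_read UPPER LOWER NAME: print NAME as the merged view would see it.
overlay_read() {
    if [ -c "$1/$3" ]; then
        return 1                 # whiteout in upper: the file counts as deleted
    elif [ -f "$1/$3" ]; then
        cat "$1/$3"              # upper wins when the file is present there
    elif [ -f "$2/$3" ]; then
        cat "$2/$3"              # otherwise fall through to lower
    else
        return 1                 # not found in either layer
    fi
}

# overlay_write UPPER LOWER NAME TEXT: append TEXT to NAME via the merged view.
overlay_write() {
    if [ -c "$1/$3" ]; then
        rm -f "$1/$3"            # writing over a whiteout recreates the file
    elif [ ! -e "$1/$3" ] && [ -f "$2/$3" ]; then
        cp "$2/$3" "$1/$3"       # copy-up: bring the lower file into upper first
    fi
    printf '%s\n' "$4" >> "$1/$3"   # the change itself lands in upper only
}

# Small demonstration with scratch directories.
d=$(mktemp -d)
mkdir "$d/upper" "$d/lower"
echo "v1" > "$d/lower/notes.txt"
overlay_write "$d/upper" "$d/lower" notes.txt "v2"
overlay_read "$d/upper" "$d/lower" notes.txt
cat "$d/lower/notes.txt"
```

In this demonstration the read goes to the copied-up file in upper, which now holds both lines, while lower/notes.txt still contains only v1.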
Multiple layers
Docker images are often composed of many layers, say 25. Overlayfs supports multiple lower directories, separated by colons, with the leftmost directory acting as the topmost layer, so you can run
mount -t overlay overlay -o lowerdir=/dir1:/dir2:/dir3:...:/dir25,upperdir=...
So I assume that’s how containers with many Docker layers work: Docker unpacks each layer into a separate directory and then asks overlayfs to combine them all, together with an empty upper directory that the container writes its changes to.
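As a rough illustration of the multi-layer case, the sketch below joins a list of layer directories into a lowerdir value and models a first-match lookup across them. join_layers and find_in_layers are hypothetical helper names, the layer directories are scratch paths, and the lookup ignores whiteouts for brevity. It assumes the layers are passed topmost first, matching the order overlayfs expects in lowerdir.

```shell
#!/bin/sh
# join_layers DIR...: join the arguments with ':' for use as lowerdir=...
join_layers() {
    joined=""
    for layer in "$@"; do
        joined="${joined:+$joined:}$layer"
    done
    printf '%s\n' "$joined"
}

# find_in_layers NAME DIR...: print the first (topmost) layer containing NAME.
find_in_layers() {
    name=$1
    shift
    for layer in "$@"; do
        if [ -e "$layer/$name" ]; then
            printf '%s\n' "$layer"
            return 0
        fi
    done
    return 1
}

# Three scratch layers; layer1 is listed first, so it is the topmost.
root=$(mktemp -d)
mkdir "$root/layer1" "$root/layer2" "$root/layer3"
echo "old" > "$root/layer3/app.conf"
echo "new" > "$root/layer2/app.conf"

join_layers "$root/layer1" "$root/layer2" "$root/layer3"
find_in_layers app.conf "$root/layer1" "$root/layer2" "$root/layer3"
```

The second call reports layer2, because it sits above layer3 and therefore hides the older app.conf.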
Conclusion
In conclusion, OverlayFS gives Docker a flexible way to manage container file systems: image layers are shared between containers instead of being copied, which saves both disk space and time, and each container’s changes stay in its own upper directory. Understanding how OverlayFS works behind the scenes provides valuable insight into Docker’s storage architecture for developers and system administrators alike.