Link to the HackMD note
Joseph Chazalon, Clement Demoulins. February 2021. EPITA Research & Development Laboratory (LRDE)
About this course
This is a course about containers, using Docker:
- What it is
- How to use it for simple, then less simple cases
- Practice
Tools and grading
Graded content for each session, using Moodle
- For sessions 1 and 2: 10 min quiz on Moodle at the end of each session
Software stack illustrated
A real case of 2 incompatible software stacks we had to handle
Many solutions
- Use libs with forward/backward compatibility
- Fix bad dependency declarations in packages
- Use a language compatibility layer
- Rebuild stuff manually
- Install various versions of libs at different places
Dependency hell
When you have to rebuild your whole software stack manually, step by step, checking each dependency.
What are you paid for?
What you really want is simply to separate:
- your development & product software stack
- your os & userland software stack
And what about deployment?
Deployment challenge
Solutions
- Containers
- Much lighter
- Near-instant startup
- Virtual Machines
- Emulate the operating system
- Must reserve RAM, CPU, etc. in advance
- Several VMs can run on the same machine
Containers and virtual machines
- are 2 good solutions to software stack isolation
- have similar resource isolation and allocation benefits (CPU, mem, net & disk IO)
- but function differently because
- containers virtualize the OS (the kernel)
- instead of hardware
- so containers are
- lighter and faster than VMs (min storage)
- more portable
- less secure
Docker
Promises
- lightweight
- easy deployment
Benefits for dev
Build once… run anywhere
- portable runtime env
- no worries about missing dependencies
- run each app in its own isolated container
- Automate testing, integration, packaging
Benefits for admin
Configure once… run anything
- Make the entire lifecycle more efficient, consistent and repeatable
- Increase the quality of code
Docker adoption
Docker was launched in 2013 (8 years ago) and became a massive trend.
- Github project search “docker” $\rightarrow$ $\gt$ 45 000 projects
Reasons for NOT using (Docker) containers (currently)
- Archiving your program (because containers are not made for that)
- Your program uses OSX primitives
- Your program runs on Windows only
- You need to deploy many containers on clusters
Demo1: VSCode Remote Container
Implementation of Docker Containers
Under the hood, Docker is built on the following components:
- The Go programming language
- The following features of the Linux kernel
- Namespaces
- Cgroups
- capabilities
Namespaces
According to man namespaces:
A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.
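A quick way to see this in practice (a sketch, assuming a Linux system with procfs mounted): each entry under `/proc/self/ns` is a symlink naming a namespace type and its identifier.

```shell
# List the namespaces of the current shell process.
# Each symlink target looks like "pid:[4026531836]": the namespace
# type plus a unique identifier. Two processes whose symlinks show
# the same identifier share that namespace.
ls -l /proc/self/ns
```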
Cgroups
According to man cgroups
Control groups, usually referred to as cgroups, are a Linux kernel feature which allow processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored. The kernel’s cgroup interface is provided through a pseudo-filesystem called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and limits are implemented in a set of per-resource-type subsystems (memory, CPU, and so on).
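The cgroup membership of a process can be inspected through procfs, and the cgroupfs pseudo-filesystem mentioned above is usually mounted under `/sys/fs/cgroup` (a sketch, assuming a Linux host):

```shell
# Show which cgroup(s) the current process belongs to.
cat /proc/self/cgroup

# The cgroup hierarchy is exposed as a pseudo-filesystem:
# each directory is a group, each file a resource knob.
ls /sys/fs/cgroup
```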
Capabilities
According to man capabilities:
Traditional UNIX implementations distinguish two categories of processes: privileged processes (whose effective user ID is 0, referred to as superuser or root), and unprivileged processes (whose effective UID is nonzero). Privileged processes bypass all kernel permission checks, while unprivileged processes are subject to full permission checking based on the process’s credentials (usually: effective UID, effective GID, and supplementary group list). Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.
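The capability sets of a process are visible in `/proc/self/status` as hexadecimal bitmasks (a sketch, assuming Linux; the `capsh` tool, where installed, can decode them):

```shell
# Show the capability bitmasks of the current process:
# CapInh (inheritable), CapPrm (permitted), CapEff (effective), etc.
# An all-zero CapEff means the process has no special privileges.
grep '^Cap' /proc/self/status
```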
Open Container Initiative runtime (container) specifications
Container configuration, lifecycle, and how to represent them using JSON files.
- creating
- the container is being created
- created
- the runtime has finished the create operation, and the container process has neither exited nor executed the user-specified program
- running
- the container process has executed the user-specified program but has not exited
- stopped
- the container process has exited
Open Container Initiative image specifications
An image stores the files for the root FS of a container, i.e. the files your containerized program will see
Problem(s)
- Many containers share the same base (Ubuntu, Alpine, Debian, etc.)
- because we do not want to rebuild a complete software stack by hand down to the kernel
Solution:
- Split images into meaningful layers: Ubuntu base, Python dependencies, App…
- Share common layers between containers in read-only
- Add a thin writable layer on top of this stack of layers
- View this stack as a single, consistent and writable filesystem
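As an illustration (a hypothetical Dockerfile; image and package names are assumptions), each instruction below produces one layer, and the base layers can be shared read-only with every other image built on the same Ubuntu base:

```dockerfile
# Layer: shared Ubuntu base, read-only, reused across images
FROM ubuntu:20.04

# Layer: Python dependencies
RUN apt-get update && apt-get install -y python3 python3-pip

# Layer: the application itself
COPY app.py /opt/app/app.py
CMD ["python3", "/opt/app/app.py"]
```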
Image Layers
Efficiency implemented using Copy-on-Write (COW)
Open Container Initiative distribution specifications
API protocol to facilitate distribution of images:
- What is a repository
- How to list, pull, push images
- HTTP API
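A sketch of that HTTP API against a local registry (endpoint paths are from the distribution spec; the registry address and image name are assumptions, and real registries such as Docker Hub usually require an authentication token first):

```shell
# Assumes a local registry, e.g. started with:
#   docker run -d -p 5000:5000 registry:2

# /v2/_catalog lists the repositories the registry knows about.
curl -s http://localhost:5000/v2/_catalog

# /v2/<name>/tags/list enumerates the tags of one repository.
curl -s http://localhost:5000/v2/myimage/tags/list
```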
Images and containers
When using Docker, you think about images and containers
Good to remember
- A (Docker) container is just:
- a root filesystem with some bind mounts containing all the software stack down to the kernel
- a control policy enforced by the kernel with some isolation mechanisms: PID, network, etc.
- some environment variables, kernel configuration and automatically generated files: for hostname, DNS resolution, etc.
- an abstract view of a group of processes not even a single kernel object
Using Docker
Regular workflow
- Obtain an image
```shell
docker image pull USER/IMAGENAME:TAG
docker image import ARCHIVE
docker image build ...
```
- Create a container for image
```shell
docker container create --name CONTAINER_NAME IMAGE
```
- Start the container
```shell
docker container start CONTAINER_NAME
```
- (opt.) execute more programs within the container
```shell
docker container exec CONTAINER_NAME COMMAND
```
- Attach your console to the container
```shell
docker container attach CONTAINER_NAME
```
- manage/monitor the container
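Putting the steps above together, a hypothetical end-to-end session might look like this (image and container names are examples; assumes a running Docker daemon):

```shell
# 1. Obtain an image
docker image pull ubuntu:20.04

# 2. Create a named container from it, with an interactive TTY
docker container create --name demo -it ubuntu:20.04 bash

# 3. Start it, then run an extra process inside it
docker container start demo
docker container exec demo ls /

# 4. Attach a console, then clean up
docker container attach demo   # detach with Ctrl-p Ctrl-q
docker container stop demo
docker container rm demo
```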
Container storage explained
Storage overview
Where is Docker data stored ?
Under /var/lib/docker
Base image content
Bind mounts
Volumes
What
- Shareable space managed by Docker
- Can be used to share data between containers
- Created using `docker volume create VOLNAME`, or `--volume`, or `--mount type=volume` on start/run
- Survive container removal: must be removed manually
Where
- Stored under `/var/lib/docker/volumes/` + name or unique id
Reusing volumes for another container
It is possible to mount volumes from another container. This can be convenient in several cases:
- get a shell in a super minimal container
- migrate a database
- upgrade a container
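A sketch of this volume-reuse pattern (container, volume, and image names are hypothetical; assumes a running Docker daemon). `--volumes-from` mounts all the volumes of an existing container into a new one:

```shell
# A container writing into a named volume
docker container run -d --name db -v dbdata:/var/lib/data myimage

# A throwaway container reusing the same volumes, e.g. to get a
# shell next to the data in a super minimal image, or to back it up
docker container run --rm -it --volumes-from db alpine sh
```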
Fragile isolation with host
- Relies on kernel security
- You can share a lot of things with host
- Many public images run services as root