Link to the HackMD note
Joseph Chazalon, Clement Demoulins. February 2021. EPITA Research & Development Laboratory (LRDE)
About this course
This is a course about containers, using Docker:
- What it is
- How to use it for simple, then less simple cases
- Practice
Tools and grading
Graded content for each session, using Moodle
- For sessions 1 and 2: 10 min quiz on Moodle at the end of each session
Software stack illustrated
A real case of 2 incompatible software stacks we had to handle
Many solutions
- Use libs with forward/backward compatibility
- Fix bad dependency declarations in packages
- Use a language compatibility layer
- Rebuild stuff manually
- Install various versions of libs at different places
Dependency hell
When you have to rebuild your whole software stack manually, step by step, checking each dependency.
What are you paid for?
What you really want is simply to separate:
- your development & product software stack
- your os & userland software stack
And what about deployment?
Deployment challenge
Solutions
- Containers
- Much lighter
- Near-instant startup
- Virtual Machines
- Emulate the operating system
- Must reserve RAM, CPU, etc. in advance
- Several VMs can run on the same machine
Containers and virtual machines
- are 2 good solutions to software stack isolation
- have similar resource isolation and allocation benefits (CPU, mem, net & disk IO)
- but function differently because
- containers virtualize the OS (the kernel)
- instead of hardware
- so containers are
- lighter and faster than VMs (min storage)
- more portable
- less secure
Docker
Promises
- lightweight
- easy deployment
Benefits for dev
Build once… run anywhere
- portable runtime env
- no worries about missing dependencies
- run each app in its own isolated container
- Automate testing, integration, packaging
Benefits for admin
Configure once… run anything
- Make the entire lifecycle more efficient, consistent and repeatable
- Increase the quality of code
Docker adoption
Docker was launched in 2013 (8 years ago) and became a massive trend.
- Github project search “docker” $\rightarrow$ $\gt$ 45 000 projects
Reasons for NOT using (Docker) containers (currently)
- Archiving your program (because containers are not made for that)
- Your program uses OSX primitives
- Your program runs on Windows only
- You need to deploy many containers on clusters
Demo1: VSCode Remote Container
Implementation of Docker Containers
Under the hood, Docker is built on the following components:
- The Go programming language
- The following features of the Linux kernel
- Namespaces
- Cgroups
- capabilities
Namespaces
According to man namespaces:
A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.
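A quick way to see this in practice (a sketch, assuming a Linux system with procfs mounted): each entry under `/proc/self/ns` is a symlink naming a namespace type and its identifier.

```shell
# List the namespaces of the current shell process.
# Each symlink target looks like "pid:[4026531836]": the namespace
# type plus a unique identifier. Two processes whose symlinks show
# the same identifier share that namespace.
ls -l /proc/self/ns
```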
Cgroups
According to man cgroups
Control groups, usually referred to as cgroups, are a Linux kernel feature which allow processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored. The kernel’s cgroup interface is provided through a pseudo-filesystem called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and limits are implemented in a set of per-resource-type subsystems (memory, CPU, and so on).
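The cgroup membership of a process can be inspected through procfs, and the cgroupfs pseudo-filesystem mentioned above is usually mounted under `/sys/fs/cgroup` (a sketch, assuming a Linux host):

```shell
# Show which cgroup(s) the current process belongs to.
cat /proc/self/cgroup

# The cgroup hierarchy is exposed as a pseudo-filesystem:
# each directory is a group, each file a resource knob.
ls /sys/fs/cgroup
```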
Capabilities
According to man capabilities:
Traditional UNIX implementations distinguish two categories of processes: privileged processes (whose effective user ID is 0, referred to as superuser or root), and unprivileged processes (whose effective UID is nonzero). Privileged processes bypass all kernel permission checks, while unprivileged processes are subject to full permission checking based on the process’s credentials (usually: effective UID, effective GID, and supplementary group list). Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.
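The capability sets of a process are visible in `/proc/self/status` as hexadecimal bitmasks (a sketch, assuming Linux; the `capsh` tool, where installed, can decode them):

```shell
# Show the capability bitmasks of the current process:
# CapInh (inheritable), CapPrm (permitted), CapEff (effective), etc.
# An all-zero CapEff means the process has no special privileges.
grep '^Cap' /proc/self/status
```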
Open Container Initiative runtime (container) specifications
Container configuration, lifecycle, and how to represent them using JSON files.
- creating
- the container is being created
- created
- the runtime has finished the create operation, and the container process has neither exited nor executed the user-specified program
- running
- the container process has executed the user-specified program but has not exited
- stopped
- the container process has exited
Open Container Initiative image specifications
An image stores the files for the root FS of a container, i.e. the files your containerized program will see
Problem(s)
- Many containers share the same base (Ubuntu, Alpine, Debian, etc.)
- because we do not want to rebuild a complete software stack by hand down to the kernel
Solution:
- Split images into meaningful layers: Ubuntu base, Python dependencies, App…
- Share common layers between containers in read-only
- Add a thin writable layer on top of this stack of layers
- View this stack as a single, consistent and writable filesystem
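As an illustration (a hypothetical Dockerfile; image and package names are assumptions), each instruction below produces one layer, and the base layers can be shared read-only with every other image built on the same Ubuntu base:

```dockerfile
# Layer: shared Ubuntu base, read-only, reused across images
FROM ubuntu:20.04

# Layer: Python dependencies
RUN apt-get update && apt-get install -y python3 python3-pip

# Layer: the application itself
COPY app.py /opt/app/app.py
CMD ["python3", "/opt/app/app.py"]
```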
Image Layers
Efficiency implemented using Copy-on-Write (COW)
Open Container Initiative distribution specifications
API protocol to facilitate distribution of images:
- What is a repository
- How to list, pull, push images
- HTTP API
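A sketch of that HTTP API against a local registry (endpoint paths are from the distribution spec; the registry address and image name are assumptions, and real registries such as Docker Hub usually require an authentication token first):

```shell
# Assumes a local registry, e.g. started with:
#   docker run -d -p 5000:5000 registry:2

# /v2/_catalog lists the repositories the registry knows about.
curl -s http://localhost:5000/v2/_catalog

# /v2/<name>/tags/list enumerates the tags of one repository.
curl -s http://localhost:5000/v2/myimage/tags/list
```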
Images and containers
When using Docker, you think about images and containers
Good to remember
- A (Docker) container is just:
- a root filesystem with some bind mounts containing all the software stack down to the kernel
- a control policy enforced by the kernel with some isolation mechanisms: PID, network, etc.
- some environment variables, kernel configuration and automatically generated files: for hostname, DNS resolution, etc.
- an abstract view of a group of processes not even a single kernel object
Using Docker
Regular workflow
- Obtain an image
```shell
docker image pull USER/IMAGENAME:TAG
docker image import ARCHIVE
docker image build ...
```
- Create a container for image
```shell
docker container create --name CONTAINER_NAME IMAGE
```
- Start the container
```shell
docker container start CONTAINER_NAME
```
- (opt.) execute more programs within the container
```shell
docker container exec CONTAINER_NAME COMMAND
```
- Attach your console to the container
```shell
docker container attach CONTAINER_NAME
```
- manage/monitor the container
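Putting the steps above together, a hypothetical end-to-end session might look like this (image and container names are examples; assumes a running Docker daemon):

```shell
# 1. Obtain an image
docker image pull ubuntu:20.04

# 2. Create a named container from it, with an interactive TTY
docker container create --name demo -it ubuntu:20.04 bash

# 3. Start it, then run an extra process inside it
docker container start demo
docker container exec demo ls /

# 4. Attach a console, then clean up
docker container attach demo   # detach with Ctrl-p Ctrl-q
docker container stop demo
docker container rm demo
```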
Container storage explained
Storage overview
Where is Docker data stored ?
Under /var/lib/docker
Base image content
Bind mounts
Volumes
What
- Shareable space managed by Docker
- Can be used to share data between containers
- Created using `docker volume create VOLNAME`, or `--volume`, or `--mount type=volume` on start/run
- Survive container removal: must be removed manually
Where
- Stored under `/var/lib/docker/volumes/` + name or unique id
Reusing volumes for another container
It is possible to mount volumes from another container. This can be convenient in several cases:
- get a shell in a super minimal container
- migrate a database
- upgrade a container
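A sketch of this volume-reuse pattern (container, volume, and image names are hypothetical; assumes a running Docker daemon). `--volumes-from` mounts all the volumes of an existing container into a new one:

```shell
# A container writing into a named volume
docker container run -d --name db -v dbdata:/var/lib/data myimage

# A throwaway container reusing the same volumes, e.g. to get a
# shell next to the data in a super minimal image, or to back it up
docker container run --rm -it --volumes-from db alpine sh
```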
Fragile isolation with host
- Relies on kernel security
- You can share a lot of things with host
- Many public images run services as root