Docker Introduction (Part 1): Virtualization & Containerization

By Alexander Eriksson


As a self-taught programmer, I've often found that online resources jump straight into practical applications without explaining the fundamental concepts first. Docker is no exception. Many tutorials show you how to set up an entire web server in a container, but they leave you wondering, "What is an image?" This guide is the first in a series on Docker aimed at anyone who has ever felt that gap in their knowledge. The goal is to build a solid understanding of how Docker works from the ground up, so you'll be prepared to configure things on your own long after the tutorial ends. By the time you've finished this guide, you'll have a better understanding of Docker than most of its users.

Virtualization with a Hypervisor

Virtualization is the process of allocating portions of the CPU and memory, and providing virtualized access to I/O devices, in order to run a separate (virtual) operating system: a 'virtual machine'.

The program that runs and manages the lifecycle of these virtual machines is called a hypervisor. Well-known examples are VMware and VirtualBox.

flowchart LR
    subgraph HOST
        direction LR
        CPU[(CPU)]
        MEM[(MEM)]
        IO[(I/O)]
        style HOST fill:#e8f4f8,stroke:#b0c4de,stroke-width:2px,color:#333
    end
    subgraph HYPERVISOR
        direction LR
        VM{{VM}}
        OS[[OS]]
        style HYPERVISOR fill:#f0fff0,stroke:#90ee90,stroke-width:2px,color:#333
    end
    style CPU fill:#add8e6,stroke:#6a5acd,stroke-width:2px,color:#333
    style MEM fill:#add8e6,stroke:#6a5acd,stroke-width:2px,color:#333
    style IO fill:#add8e6,stroke:#6a5acd,stroke-width:2px,color:#333
    style VM fill:#f08080,stroke:#8b0000,stroke-width:2px,color:#333
    style OS fill:#f08080,stroke:#8b0000,stroke-width:2px,color:#333
    CPU --> HYPERVISOR
    MEM --> HYPERVISOR
    IO --> HYPERVISOR

So what virtualization does is take a small part of your CPU and memory and allocate it to the virtualized environment, essentially creating a fresh (virtual) computer inside your physical one. A ridiculous example (which is entirely possible) is running a new "virtual" computer inside your existing computer and repeating this process until you run out of memory or CPU:

flowchart TD
    subgraph A[Physical Computer]
        subgraph B[Virtual Machine]
            subgraph C[VM inside VM]
                subgraph D[VM inside VM inside VM]
                end
            end
        end
    end
    style A fill:#add8e6,stroke:#000,stroke-width:2px,color:#333
    style B fill:#f08080,stroke:#8b0000,stroke-width:2px,color:#fff
    style C fill:#90ee90,stroke:#006400,stroke-width:2px,color:#333
    style D fill:#dda0dd,stroke:#4b0082,stroke-width:2px,color:#333

Of course, you could also create multiple virtual machines on the same host:

flowchart TD
    subgraph A[Physical Computer]
        subgraph VM1[Virtual Machine 1]
        end
        subgraph VM2[Virtual Machine 2]
        end
        subgraph VM3[Virtual Machine 3]
        end
    end
    style A fill:#add8e6,stroke:#000,stroke-width:2px,color:#333
    style VM1 fill:#f08080,stroke:#8b0000,stroke-width:2px,color:#fff
    style VM2 fill:#90ee90,stroke:#006400,stroke-width:2px,color:#333
    style VM3 fill:#dda0dd,stroke:#4b0082,stroke-width:2px,color:#333

As you can imagine, however, this creates quite a bit of overhead in terms of resource usage and performance: the more VMs you run on the same system, the fewer resources are left for your physical machine to use.

For running smaller applications or microservices, creating an isolated environment through a full virtual machine is often overkill. An alternative is containerization, through tools like Docker.

flowchart TD
    subgraph HOST["host"]
        direction LR
        SUBP["subprocesses"]
        style HOST fill:#e8f4f8,stroke:#b0c4de,stroke-width:2px,color:#333
        style SUBP fill:#add8e6,stroke:#6a5acd,stroke-width:2px,color:#333
        CHROOT["chroot"]
        RLIMIT["rlimit"]
        style CHROOT fill:#f08080,stroke:#8b0000,stroke-width:2px,color:#333
        style RLIMIT fill:#f08080,stroke:#8b0000,stroke-width:2px,color:#333
        SUBP --> CHROOT
        SUBP --> RLIMIT
    end
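The diagram above hints that container-style isolation is assembled from ordinary kernel features for restricting subprocesses: chroot restricts a process's view of the filesystem, while rlimits cap its resource usage. As a minimal sketch of the rlimit half (plain shell, no Docker, purely an illustration of the underlying idea):

```shell
#!/bin/sh
# Sketch: restrict a subprocess with an rlimit, no container runtime needed.
# 'ulimit -n' caps the number of open file descriptors for that shell and
# everything it spawns. Real container runtimes combine limits like these
# (and cgroups) with namespaces and chroot-style filesystem isolation.

# Run a child shell with a lowered open-files limit:
sh -c 'ulimit -n 64; echo "child fd limit: $(ulimit -n)"'

# The parent shell is unaffected:
echo "parent fd limit: $(ulimit -n)"
```

Lowering a soft limit like this needs no special privileges, which is part of why these primitives are such convenient building blocks.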

How Containerization Differs from Virtualization

While virtualization creates entirely separate operating systems, containerization focuses on isolating processes while still running on the same host kernel. Instead of emulating hardware and running multiple full operating systems, containers create a lightweight environment that packages only what the application needs: binaries, libraries, and dependencies.

Imagine we have an Ubuntu machine and we want to run two other environments (e.g. Debian & Alpine) inside this machine. Then we have two options:

  • Virtualization: We set aside a portion of our CPU and memory and delegate it to a new Debian VM, then do the same again for an Alpine VM. Each VM runs its own full operating system and kernel, separate from the Ubuntu host.
  • Containerization: Instead of starting a new kernel, we create a container that runs Debian’s userland (binaries and libraries) while still sharing the Ubuntu host’s kernel. From inside the container it looks like Debian, but under the hood it’s the same Linux kernel as the host.
flowchart TB
    subgraph V["Virtualization"]
        direction TB
        Host1["Ubuntu Host OS"]
        HV["Hypervisor"]
        GuestOS["Debian OS - with its own kernel"]
        App1["Application"]
        Host1 --> HV --> GuestOS --> App1
    end
    %% Styles
    style V fill:#f0fff0,stroke:#90ee90,stroke-width:2px,color:#333
    style Host1 fill:#add8e6,stroke:#000,color:#333
    style HV fill:#ffe4b5,stroke:#8b4513,color:#333
    style GuestOS fill:#f08080,stroke:#8b0000,color:#fff
    style App1 fill:#dda0dd,stroke:#4b0082,color:#333
flowchart TB
    subgraph C["Containerization"]
        direction TB
        Host2["Ubuntu Host OS (Kernel shared)"]
        Docker["Container Runtime (Docker)"]
        Cont1["Debian Container (userland + App)"]
        Cont2["Alpine Container (userland + App)"]
        Host2 --> Docker --> Cont1
        Docker --> Cont2
    end
    %% Styles
    style C fill:#f0f8ff,stroke:#4682b4,stroke-width:2px,color:#333
    style Host2 fill:#add8e6,stroke:#000,color:#333
    style Docker fill:#ffe4b5,stroke:#8b4513,color:#333
    style Cont1 fill:#90ee90,stroke:#006400,color:#333
    style Cont2 fill:#dda0dd,stroke:#4b0082,color:#333

Something like VMware therefore runs a full guest operating system on top of the host OS (or directly on the hardware). Docker instead uses container-based virtualization, which runs at the application level using containers and is much more lightweight than a VM. By now you should start to see the benefits of containerization over virtualization when we are only running smaller services, such as a web app.

To understand Docker containers, we need to understand what an image is

Now that we have an idea of what containerization is, what the heck is an image? Images are central to working with Docker, so let's build a base understanding of how they work so that you can use them independently.

As we said earlier, containerization differs from virtualization in that it does not run a full operating system (OS). Instead, a container runs from a lightweight filesystem made out of layers.

The fact that an image is built out of layers hopefully feels a bit more natural if we draw an analogy to how we think of "images" in everyday life.

Imagine we are creating a Photoshop file composed of multiple transparent layers, all stacked on top of each other: shapes, text, or maybe parts of a photo. Docker images work on the very same principle, because they are made of layers as well! Each layer in a Docker image represents a specific change to the filesystem, like adding a file or installing a dependency. The result of stacking these layers on top of one another is what we call an image.
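To make the analogy concrete, here is a minimal sketch in plain shell (no Docker) that stacks "layers" of files in order, with later layers overriding earlier ones. This imitates the idea behind the union filesystems Docker builds on, not Docker's actual storage drivers (such as overlay2):

```shell
#!/bin/sh
# Sketch: simulate image layers as directories merged in order.
# Later layers override files from earlier ones, just like the
# topmost Photoshop layer hides what's underneath.
set -e
workdir=$(mktemp -d)

# Layer 1: base "filesystem" with two files
mkdir -p "$workdir/layer1"
echo "from layer 1" > "$workdir/layer1/app.conf"
echo "base tool"    > "$workdir/layer1/tool"

# Layer 2: changes app.conf and adds a new file
mkdir -p "$workdir/layer2"
echo "from layer 2" > "$workdir/layer2/app.conf"
echo "extra lib"    > "$workdir/layer2/lib"

# "Build" the image: apply layers bottom-up into one merged view
mkdir -p "$workdir/merged"
for layer in layer1 layer2; do
    cp -R "$workdir/$layer/." "$workdir/merged/"
done

cat "$workdir/merged/app.conf"   # prints: from layer 2
ls "$workdir/merged"             # app.conf  lib  tool
rm -rf "$workdir"
```

The merged view contains every file from every layer, but where two layers touch the same path, the later layer wins.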

Constructing our first image

Before we start creating actually useful images, it is a good idea to show the very basics of how to build an image.

Let us create our first Dockerfile:

mkdir -p ~/src/sandbox/my_first_docker_project && # create dir
cd $_ # go into the newly created dir
touch Dockerfile # create the Dockerfile

In this directory, let's also create a file called hello.txt and fill this file with some text:

echo "My favourite TV show is Seinfeld." > hello.txt

(you'll soon find out why we created this text file).

Now that we have created our first Dockerfile, what is this file for? Each line in a Dockerfile is an instruction that Docker follows to build an image.

We always start our Dockerfile with a base image. Since this is our first Docker project, we are going to use a very minimal base image called alpine. It is tiny (~5 MB) but includes some basic Linux utilities, making it perfect for simple examples.

# Start from a minimal base
# This directive pulls the image from DockerHub
# Link: https://hub.docker.com/_/alpine
FROM alpine:3.18

# Add a simple file
COPY hello.txt /hello.txt

# Default command
# Print the content of our file
CMD ["cat", "/hello.txt"]

The FROM statement is not doing anything magical. It works on the same principle as other tools you are likely familiar with, such as GitHub: it pulls the alpine image from Docker Hub, which is like cloning a minimal Linux environment ready to use.

So to make it perfectly clear: we said earlier that an image is built by stacking layers on top of one another. To avoid starting from absolute scratch, we pull an image (which already contains a couple of layers) and then add our own layers on top, adapting the existing image to our needs. In fact, you can see the specific layers that this image contains on Docker Hub: Alpine Image (latest).

The next directive, COPY hello.txt /hello.txt, copies our local hello.txt file into the root (/) of the container. This way, the file is included inside the image and will be available when the container runs.

Finally, the CMD ["cat", "/hello.txt"] instruction sets the container's default command (this is distinct from Docker's separate ENTRYPOINT instruction). It tells Docker what to run when we start the container without specifying a command: in this case, print the contents of hello.txt.
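A small aside on CMD syntax: the JSON array above is the "exec form", which runs the binary directly; there is also a "shell form" that wraps the command in /bin/sh -c. A sketch of both, using our hello.txt example:

```dockerfile
# Exec form (preferred): runs "cat" directly as the container's
# main process, so it receives signals like SIGTERM properly.
CMD ["cat", "/hello.txt"]

# Shell form: equivalent to ["/bin/sh", "-c", "cat /hello.txt"],
# handy when you need shell features like variable expansion or pipes.
# CMD cat /hello.txt
```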

This is the simplest working image we can build, and it demonstrates how layers are created for each instruction in a Dockerfile. In the next section, we’ll dive deeper into how Docker uses these layers to build images efficiently.

Building our first image and running it with a container

Now that we have our Dockerfile (our recipe) ready, we can use it to build our first image:

# Uses the "docker build" command.
# The -t flag tags our image as "my-first-image";
# the trailing "." sets the build context to the current directory.
docker build -t my-first-image .

Given that you made no errors, the image is now successfully built. Running docker image ls lists all the images (recipes) that currently exist on your system:

❯ docker image ls
REPOSITORY                  TAG         IMAGE ID       CREATED         SIZE
my-first-image              latest      86507d9501c6   2 minutes ago   7.36MB

Now that we have built our first image, we are ready to run it. Think of running the container like cooking the dish using the recipe.

# Note: We use the --rm flag to immediately remove the container
# once it exits, so it won't persist on our system.
❯ docker run --rm -t my-first-image
My favourite TV show is Seinfeld.

Docker takes our recipe (my-first-image) and "cooks" it. Just like following a recipe, the container produces exactly what we asked for. In this case, printing the contents of hello.txt.

We can even peek inside the container to see that our "ingredients" really exist, by running an interactive shell:

❯ docker run -it --rm my-first-image sh

Now that we are inside the container, we can simply run ls to confirm that our recipe was cooked properly. We should expect hello.txt to sit alongside the files of our base image (the Alpine image):

/ # ls -l
total 60
drwxr-xr-x    2 root     root          4096 Feb 13  2025 bin
drwxr-xr-x    5 root     root           360 Aug 16 11:02 dev
drwxr-xr-x    1 root     root          4096 Aug 16 11:02 etc
-rw-rw-r--    1 root     root            34 Aug 16 10:29 hello.txt
drwxr-xr-x    2 root     root          4096 Feb 13  2025 home
drwxr-xr-x    7 root     root          4096 Feb 13  2025 lib
drwxr-xr-x    5 root     root          4096 Feb 13  2025 media
drwxr-xr-x    2 root     root          4096 Feb 13  2025 mnt
drwxr-xr-x    2 root     root          4096 Feb 13  2025 opt
dr-xr-xr-x  553 root     root             0 Aug 16 11:02 proc
drwx------    1 root     root          4096 Aug 16 11:02 root
drwxr-xr-x    2 root     root          4096 Feb 13  2025 run
drwxr-xr-x    2 root     root          4096 Feb 13  2025 sbin
drwxr-xr-x    2 root     root          4096 Feb 13  2025 srv
dr-xr-xr-x   13 root     root             0 Aug 16 11:02 sys
drwxrwxrwt    2 root     root          4096 Feb 13  2025 tmp
drwxr-xr-x    7 root     root          4096 Feb 13  2025 usr
drwxr-xr-x   12 root     root          4096 Feb 13  2025 var

which we can see is clearly the case.

Wrapping Up

With this foundation, you now understand the difference between virtualization and containerization, how Docker images are constructed from layers, and how containers run isolated environments.

Next up, we'll look at how to deal with volumes and persistent data.
