Dive Into the Deep
Welcome to the sixth article of my technical articles series! Today, we're diving into the exciting world of Docker containers. Get ready to uncover a method for exploiting privileged Docker containers. I'll assume you've already got a basic grasp of the fundamental concepts of Docker, so let's roll up our sleeves and get started!
Exploring the setcap Command
The setcap command is used in Linux to set file capabilities, which allow you to grant specific privileges to executables. Unlike traditional root permissions, file capabilities enable a more granular approach, assigning only the necessary privileges to a process. This enhances security by reducing the need to grant full root access.
For example, sudo setcap 'cap_net_bind_service=+ep' /path/to/myapp allows the executable myapp to bind to ports below 1024 without needing full root permissions.
At their core, Linux capabilities are subsets of root's privileges that the kernel can grant to individual processes or executables. They allow permissions to be assigned piece by piece instead of all at once.
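To make the "subset of root" idea concrete, here is a minimal sketch that tests a single capability bit. It assumes a Linux system, where the kernel exposes each process's effective capability set as a hexadecimal bitmask on the CapEff line of /proc/<pid>/status. The helper name has_cap is hypothetical; the bit indices come from <linux/capability.h>.

```shell
#!/bin/sh
# has_cap MASK BIT: succeed if bit number BIT is set in the hex string MASK.
# (Hypothetical helper name, used only for this illustration.)
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Bit indices as defined in <linux/capability.h>
CAP_NET_BIND_SERVICE=10
CAP_SYS_ADMIN=21

# Effective capability mask of this shell; falls back to 0 where /proc is unavailable.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null || true)
cap_eff=${cap_eff:-0}

if has_cap "$cap_eff" "$CAP_NET_BIND_SERVICE"; then
    echo "this process may bind to ports below 1024"
else
    echo "this process lacks cap_net_bind_service"
fi
```

Running the same check with CAP_SYS_ADMIN inside a container is a quick way to see whether it holds the broad administrative capability discussed later in this article.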
Exploring Docker Modes
Docker containers can operate in two distinct modes: user (normal) mode and privileged mode, each with a different level of access to the host.
Figure: the two modes in action and the level of access each mode has to the host.
Container 1 runs in user/normal mode and interacts with the OS through the Docker Engine, while Container 2 runs in privileged mode, bypassing the Docker Engine's restrictions and communicating with the OS directly. Privileged containers therefore have more direct access to system resources.
Docker Access: Groups, Privileges, and the Mighty --privileged Flag!
- A normal user needs to be in the docker group to run Docker commands without sudo.
- Adding the user to the docker group allows them to execute Docker commands like docker run <image> without requiring elevated privileges.
- Privileged mode is enabled with the command: docker run --privileged -it your_image_name
- The --privileged flag grants additional capabilities to containers (e.g., hardware access, system-level features).
- In a default (non-rootless) installation, the Docker daemon itself runs as root, so no extra root or sudo rights are needed to use the --privileged flag: any user who can talk to the daemon, including a member of the docker group, can start a privileged container.
- This is why membership in the docker group is effectively equivalent to root access on the host and should be granted sparingly.
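As a small illustration of the group requirement, the sketch below checks whether the current user belongs to the docker group. The helper name in_group is hypothetical; id -nG is the standard way to list the names of a user's groups.

```shell
#!/bin/sh
# in_group "GROUP LIST" NAME: succeed if NAME appears in the space-separated list.
# (Hypothetical helper name, used only for this illustration.)
in_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

my_groups=$(id -nG 2>/dev/null || true)

if in_group "$my_groups" docker; then
    echo "member of the docker group: Docker commands work without sudo"
else
    echo "not in the docker group: Docker commands need sudo (unless you are already root)"
fi
```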
If a container has privileged access to the operating system, it can execute commands as root on the host. Using utilities like capsh from the libcap2-bin package, we can list the container's capabilities to understand the syscalls it can make and potential exploitation mechanisms.
The Mysteries of Syscalls
We need to understand what syscalls and cgroups are before starting the exploitation phase, so let's begin with syscalls.
A syscall (system call) is the way programs interact with the operating system kernel. It allows user-level applications to request services such as file operations, process control, and network communication from the kernel.
Example of a shell script whose command relies on system calls to list files in a directory:
#!/bin/bash
# List files in the current directory
echo "Listing files in the current directory:"
ls
This script runs the ls command, which in turn asks the kernel to open and read the directory through system calls such as openat and getdents64. It's a straightforward example of how user-level applications request services from the kernel using system calls.
Demystifying cgroups
Control groups (cgroups) in Linux manage and limit the resources that processes can use. They help control CPU, memory, and other resources to ensure the system runs efficiently.
On a typical Linux system using the cgroup v1 layout, the control group (cgroup) directory /sys/fs/cgroup contains one subdirectory per subsystem.
Each subsystem (blkio, cpu, cpuacct, devices, memory) represents a different resource controller and contains files that control various aspects of process resource usage within that cgroup.
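To see which cgroups a process actually belongs to, you can read /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:path. The following is a minimal sketch that prints this in a friendlier form. The function name list_cgroups is hypothetical, and an empty controller list is assumed to indicate the unified cgroup v2 hierarchy.

```shell
#!/bin/sh
# list_cgroups FILE: print "controllers -> path" for each line of a
# /proc/<pid>/cgroup style file. (Hypothetical helper name.)
list_cgroups() {
    awk -F: '{
        # An empty controller field marks the unified (v2) hierarchy
        ctrl = ($2 == "") ? "(unified v2)" : $2
        # The path may itself contain ":", so rebuild it from field 3 onward
        path = $3
        for (i = 4; i <= NF; i++) path = path ":" $i
        print ctrl " -> " path
    }' "$1"
}

# Show the cgroups of the current shell when /proc is available
if [ -r /proc/self/cgroup ]; then
    list_cgroups /proc/self/cgroup
fi
```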
Taming the Beast By Editing cgroups
When you create a new cgroup directory, the kernel automatically populates it with control files. For example, on a cgroup v1 hierarchy with the cpu controller, after creating the directory /sys/fs/cgroup/my_cgroup, the directory will typically contain files such as the following:
/sys/fs/cgroup/my_cgroup
├── cgroup.procs
├── cpu.cfs_period_us
├── cpu.cfs_quota_us
├── notify_on_release
├── release_agent
└── tasks
Here's an example of setting the CPU quota inside the cpu.cfs_quota_us file:
mkdir /sys/fs/cgroup/my_cgroup   # First, create the cgroup directory
cd /sys/fs/cgroup/my_cgroup      # Change to the new directory
echo 50000 > cpu.cfs_quota_us    # Allow 50 ms of CPU time per period
echo 100000 > cpu.cfs_period_us  # Period of 100 ms, so the limit is 50% of one CPU
Now we will look at some cgroup files and how they work with specific commands.
echo 1 > /tmp/cgrp/x/notify_on_release
The notify_on_release file tells the kernel to act when the cgroup is released, meaning that the processes assigned to that cgroup have all completed their execution or have been terminated. You write 1 to this file to enable the notification.
echo "$host_path/exploit" > /tmp/cgrp/release_agent
The release_agent file specifies a script or binary to be executed when the cgroup is released. So, when the last process leaves a cgroup that has notify_on_release enabled, the kernel runs the program named in release_agent. Because the kernel itself launches this program, it runs as root on the host rather than inside the container.
Only the process added to the cgroup (echo $$ > /tmp/cgrp/x/cgroup.procs) is affected, and it triggers the release_agent when it exits and leaves the cgroup empty.
Example of what might be written inside the cgroup.procs file:
12345
67890
11223
44556
The exploit targets a specific process, not all processes: only the process that you add to the cgroup will trigger the release agent upon exiting.
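The release condition can be modelled with ordinary files. The sketch below is only a simulation under stated assumptions: it uses a temporary directory as a stand-in for a cgroup (the kernel manages the real files), the helper name would_release is hypothetical, and it ignores child cgroups, which on a real system also have to be gone before the release agent fires.

```shell
#!/bin/sh
# would_release DIR: succeed if DIR looks like a cgroup whose release agent
# would fire, i.e. notify_on_release is 1 and cgroup.procs lists no PIDs.
# (Hypothetical helper; a simplified model, not a kernel interface.)
would_release() {
    [ "$(cat "$1/notify_on_release" 2>/dev/null)" = "1" ] || return 1
    [ -f "$1/cgroup.procs" ] || return 1
    if grep -q '[0-9]' "$1/cgroup.procs"; then
        return 1
    fi
    return 0
}

demo=$(mktemp -d)                  # stand-in directory, NOT a real cgroup
echo 1 > "$demo/notify_on_release"
echo 12345 > "$demo/cgroup.procs"  # one member process

if would_release "$demo"; then echo "release agent would run"; else echo "cgroup still has members"; fi

: > "$demo/cgroup.procs"           # simulate the last process exiting

if would_release "$demo"; then echo "release agent would run"; else echo "cgroup still has members"; fi

rm -rf "$demo"
```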
Checking Container Capabilities
Listing capabilities of a privileged Docker Container:
semo@privilegedcontainer:~$ capsh --print
Current: = cap_chown, cap_sys_module, cap_sys_chroot, cap_sys_admin, cap_setgid, cap_setuid
In the example exploit below, we are going to use the mount syscall (as allowed by the container's capabilities) to mount the host's control groups into the container.
cap_sys_admin: A broad capability that includes many administrative privileges, including the ability to mount and unmount filesystems.
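A check for a specific capability can be scripted by parsing the Current: line shown above. This is a minimal sketch: the function name current_has_cap is hypothetical, and it assumes capsh prints the capabilities as a comma-separated list. Some capsh versions may abbreviate a complete set (for example as =ep), which this sketch does not handle.

```shell
#!/bin/sh
# current_has_cap NAME: read `capsh --print` style text on stdin and succeed
# if NAME appears on the "Current:" line. (Hypothetical helper name.)
current_has_cap() {
    grep '^Current:' | tr -d ' ' | tr ',=+' '\n\n\n' | grep -qxF "$1"
}

# Sample line taken from the listing above
sample='Current: = cap_chown, cap_sys_module, cap_sys_chroot, cap_sys_admin, cap_setgid, cap_setuid'

if printf '%s\n' "$sample" | current_has_cap cap_sys_admin; then
    echo "cap_sys_admin present: the mount syscall is permitted"
else
    echo "cap_sys_admin missing"
fi
```

Inside a container you would pipe the live output instead, for example capsh --print | current_has_cap cap_sys_admin.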
Detailed Exploitation of Privileged Containers
The exploit described below is based upon (but is a modified version of) the Proof of Concept (PoC) created by Trail of Bits, which details the inner workings of this exploit well.
Exploitation Explanation
- Create a group using the Linux kernel to manage and execute our exploit. We'll use cgroups to manage processes: mount a cgroup hierarchy to /tmp/cgrp on the container as root and create a child cgroup x inside it.
- To run our exploit, we'll notify the kernel by writing 1 to /tmp/cgrp/x/notify_on_release, indicating it should execute our code when the cgroup finishes.
- Determine where the container's files are stored on the host and save this path as a variable (host_path).
- Write the host-side path of our script, $host_path/exploit, into the release_agent file, which the kernel will execute when the cgroup is released.
- Create a shell script for our exploit at /exploit in the container, which the host sees at $host_path/exploit.
- Add a command to the script to copy the host's flag into a file named flag.txt in the container upon execution.
- Make the /exploit script executable.
- Write the process ID (PID) of a short-lived shell to the cgroup.procs file located in /tmp/cgrp/x/. This adds that shell process to the control group (cgroup).
- When that process exits, the cgroup is released and the script named in release_agent is executed.
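The third step, finding where the container's files live on the host, is commonly done by reading the upperdir= option of the container's overlay mount. The sketch below assumes the overlay2 storage driver and that the mount table is readable at /etc/mtab. The function name overlay_upperdir and the sample path are hypothetical.

```shell
#!/bin/sh
# overlay_upperdir [FILE]: print the upperdir= mount option of the first
# overlay entry in FILE (default: /etc/mtab). (Hypothetical helper name.)
overlay_upperdir() {
    sed -n 's/.*upperdir=\([^, ]*\).*/\1/p' "${1:-/etc/mtab}" | head -n 1
}

# Demonstration on a made-up mount table line (the path is illustrative only)
sample=$(mktemp)
echo 'overlay / overlay rw,relatime,lowerdir=/var/lib/docker/overlay2/l/AAA,upperdir=/var/lib/docker/overlay2/abc123/diff,workdir=/var/lib/docker/overlay2/abc123/work 0 0' > "$sample"

host_path=$(overlay_upperdir "$sample")
echo "container files on the host: $host_path"

rm -f "$sample"
```

Inside a real container, calling overlay_upperdir with no argument reads /etc/mtab directly. The result is the host_path value that the later steps combine with /exploit.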
Groovy Summary of Exploit Flow
- We write a process ID to cgroup.procs to include it in the cgroup.
- The process runs and exits naturally, making cgroup.procs empty.
- The notify_on_release flag triggers the release agent when the cgroup is empty.
- release_agent runs the exploit script to perform the specified actions.
Note: if you want to practice on this vulnerability, you can check out the TryHackMe Container Vulnerability Module.
Byiee
I hope you found this article both useful and insightful. If you have any related suggestions or recommendations, feel free to connect with me on LinkedIn.