
System Call Tracing: Strace in Containers

How do I trace system calls using strace in containers?

Tracing system calls (using tools like strace) in containers requires overcoming specific security isolation mechanisms designed to protect the host and other containers.

To successfully strace a process, you must address three layers of isolation: Capabilities, Process Namespaces, and Seccomp.


1. The Core Requirement: SYS_PTRACE Capability

System call tracing relies on the kernel's ptrace system call. By default, container runtimes in Kubernetes drop most Linux capabilities, including SYS_PTRACE, to reduce the cluster's attack surface.

  • The Problem: If you simply exec into a container and run strace against another process, the attach will typically fail with an "Operation not permitted" error. The process lacks the permission to inspect or interrupt other processes.
  • The Solution: You must explicitly grant the SYS_PTRACE capability to the container performing the debugging.
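
To confirm whether a shell actually holds the capability, one option is to decode the effective capability mask that the kernel exposes in /proc. The sketch below assumes a Linux /proc filesystem and relies on CAP_SYS_PTRACE being capability number 19; the function names are hypothetical helpers, not part of any tool.

```shell
# CAP_SYS_PTRACE is capability number 19 in the Linux capability list.
CAP_SYS_PTRACE_BIT=19

# Succeed if the hex capability mask in $1 has the CAP_SYS_PTRACE bit set.
mask_has_sys_ptrace() {
  [ -n "$1" ] || return 1
  [ $(( (0x$1 >> CAP_SYS_PTRACE_BIT) & 1 )) -eq 1 ]
}

# Print the effective capability mask (CapEff) from a status file.
# Defaults to the current process; the optional path is for illustration.
effective_caps() {
  awk '/^CapEff:/ { print $2 }' "${1:-/proc/self/status}" 2>/dev/null
}

if mask_has_sys_ptrace "$(effective_caps)"; then
  echo "CAP_SYS_PTRACE: present"
else
  echo "CAP_SYS_PTRACE: missing"
fi
```

Run inside the container that will do the tracing. A "missing" result suggests strace -p will be refused until the capability is granted.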

2. Method A: Ephemeral Containers (kubectl debug)

The most modern and least disruptive way to trace a running Pod is kubectl debug, which injects an "ephemeral container" into the running Pod. Because application containers usually lack debugging tools and privileges, you apply a debugging profile to elevate the debug container's permissions, without restarting the application you are investigating.

How to do it: Use the sysadmin profile, which runs the newly created debug container as privileged and therefore gives it the capabilities it needs, including SYS_PTRACE.

bash
# Target the pod, inject an Ubuntu image, attach to the app container namespace
kubectl debug -it my-pod \
  --image=ubuntu \
  --target=app-container \
  --profile=sysadmin

  • --target: Places the debug container in the process namespace of the named container, so the debug shell can see the processes running in your application container. The container runtime must support this option.
  • --profile=sysadmin: Applies a predefined profile that runs the debug container in privileged mode, allowing privileged operations such as ptrace.

Once inside the debug shell, install strace, find the Process ID (PID) of your application, and attach to it:

bash
apt-get update && apt-get install -y strace
ps aux              # Find the PID of your app
strace -p <PID>     # Trace the process
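
If the debug image does not ship ps, the PID can also be recovered from /proc directly. The following is a minimal sketch, assuming a Linux /proc filesystem; pids_by_comm is a hypothetical helper name, and it matches on the kernel's command name, which is truncated to 15 characters.

```shell
# Print every PID whose command name (/proc/<pid>/comm) equals $1.
# Hypothetical helper for illustration; it relies only on /proc.
pids_by_comm() {
  name=$1
  for dir in /proc/[0-9]*; do
    [ -r "$dir/comm" ] || continue
    # A process may exit between the glob and the read, so ignore errors.
    if [ "$(cat "$dir/comm" 2>/dev/null)" = "$name" ]; then
      echo "${dir#/proc/}"
    fi
  done
}

# Example (assumes the application's command name is "nginx"):
#   strace -f -p "$(pids_by_comm nginx | head -n 1)"
# -f also follows child processes and threads of the traced process.
```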

3. Method B: Shared Process Namespaces (Manifest-Based)

If your environment restricts ephemeral containers, you can statically configure a Pod to share process namespaces. This allows sidecar containers (like a dedicated debugger) to see and interact with the main application's processes.

  1. Enable Namespace Sharing: Set shareProcessNamespace: true in the Pod spec. This ensures all containers in the Pod share a single Process ID namespace.
  2. Grant Capabilities: The debugging sidecar must have the SYS_PTRACE capability added to its security context.

Manifest Example:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-debug
spec:
  shareProcessNamespace: true  # 1. Flattens process isolation
  containers:
  - name: nginx
    image: nginx
  - name: debugger
    image: ubuntu
    command: ["sleep", "3600"]
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE           # 2. Grants permission to use ptrace/strace
    stdin: true
    tty: true

In this configuration, a shell started in the debugger container can see the nginx processes. Note that nginx will not be PID 1: it receives an ordinary PID, while the pause infrastructure container runs as PID 1 and reaps zombie processes.
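
A quick way to check that the namespace really is shared is to look at what runs as PID 1 from inside the debugger container. This is a sketch, assuming a Linux /proc filesystem and that the infrastructure container's command name is "pause", which can vary by runtime; pid1_comm is a hypothetical helper name.

```shell
# Print the command name of PID 1 in the current process namespace.
pid1_comm() {
  cat /proc/1/comm
}

case "$(pid1_comm)" in
  pause) echo "PID 1 is pause: the process namespace looks shared" ;;
  *)     echo "PID 1 is $(pid1_comm): the namespace may not be shared" ;;
esac
```

If PID 1 turns out to be the debugger's own command (sleep in the manifest above), the Pod was probably created without shareProcessNamespace: true.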


4. Important Constraint: Seccomp Profiles

Even after you have granted the SYS_PTRACE capability, strace may still fail if a Seccomp (Secure Computing Mode) profile applied to the container strictly filters system calls.

  • The Conflict: Seccomp restricts which system calls a process can make to the kernel. If the profile applied to the Pod (or the container runtime default) blocks the ptrace system call, strace will be denied access regardless of capability.
  • The Resolution: When you use kubectl debug with --profile=sysadmin, Kubernetes creates a privileged ephemeral container. Privileged containers typically run with AppArmor and Seccomp unconfined, so tools like strace can function without interference.
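
To see whether a seccomp filter is active for the shell you are tracing from, you can read the Seccomp field the kernel reports in /proc. The sketch below assumes a Linux /proc filesystem; seccomp_mode is a hypothetical helper name. A result of "filter" only shows that some filter is installed, not that it blocks ptrace.

```shell
# Print the seccomp mode recorded in a /proc status file.
# Kernel values: 0 = disabled, 1 = strict, 2 = filter.
# Defaults to the current process; the optional path is for illustration.
seccomp_mode() {
  mode=$(awk '/^Seccomp:/ { print $2 }' "${1:-/proc/self/status}" 2>/dev/null)
  case "$mode" in
    0) echo "disabled" ;;
    1) echo "strict" ;;
    2) echo "filter" ;;
    *) echo "unknown" ;;
  esac
}

echo "seccomp mode: $(seccomp_mode)"
```

In a privileged debug container this would be expected to report "disabled"; "filter" in a sidecar is a hint to review the profile if strace is still refused.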

Based on Kubernetes v1.35 (Timbernetes).