Pod Shutdown: Graceful vs. Forceful Termination (SIGTERM & SIGKILL)
What is the difference between SIGTERM and SIGKILL in Kubernetes?
SIGTERM is a request to shut down gracefully; SIGKILL is a forced, immediate termination. Understanding how the kubelet orchestrates these Linux signals is critical for ensuring zero-downtime deployments and preventing data corruption.
1. SIGTERM (Signal 15): The Graceful Request
SIGTERM is the standard Unix signal used to politely ask a process to stop. In Kubernetes, it represents the start of the "graceful termination" period.
- Trigger: When a Pod is marked for deletion (e.g., via kubectl delete, a scale-down event, or a new deployment rollout), the Kubelet sends SIGTERM to the main process (PID 1) in each container.
- Purpose: It grants the application time to perform critical cleanup tasks before exiting. Examples include:
  - Saving state to disk or flushing buffers.
  - Closing database connections securely.
  - Finishing in-flight HTTP requests (e.g., an NGINX server stopping new connections but serving active ones).
- Application Responsibility: The application must explicitly handle this signal. If the code receives SIGTERM, it should stop accepting new work, finish current transactions, and then exit with a success code (0).
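The handling described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a production pattern: the GracefulWorker class, its job list, and the main function are hypothetical names introduced for this example, and the real cleanup work (flushing buffers, closing connections) is reduced to a comment.

```python
import signal


class GracefulWorker:
    """Hypothetical worker that stops taking new jobs once SIGTERM arrives."""

    def __init__(self):
        self.stop_requested = False
        self.completed = []

    def request_stop(self, signum, frame):
        # Keep the handler tiny: only record that shutdown was requested.
        self.stop_requested = True

    def run(self, jobs):
        for job in jobs:
            if self.stop_requested:
                break  # stop accepting new work
            self.completed.append(job)  # stand-in for one unit of real work
        # Real cleanup (flush buffers, close connections) would go here.
        return 0  # success exit code


def main():
    worker = GracefulWorker()
    signal.signal(signal.SIGTERM, worker.request_stop)
    return worker.run(range(1000))
```

A container entrypoint built on this sketch would end with sys.exit(main()), so that a handled SIGTERM leads to exit code 0 rather than a forced kill.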
PreStop Hooks & Timing
SIGTERM is sent after any configured preStop hooks have completed. If a preStop hook hangs, SIGTERM is not sent until the hook finishes or the total grace period expires.
2. SIGKILL (Signal 9): The Forceful Stop
SIGKILL is the kernel-level signal that immediately terminates a process. It cannot be caught, ignored, or blocked by the application code.
- Trigger: The Kubelet sends SIGKILL only if the container is still running after the termination grace period has expired.
- Purpose: It is the fail-safe mechanism to ensure Pods are actually removed from the Node. It prevents "zombie" pods from lingering indefinitely if they ignore the SIGTERM request or hang deadlocked during cleanup.
- Consequence: The process is stopped abruptly by the kernel. No cleanup code executes. Active connections instantly drop (often causing HTTP 502/504 errors for clients), data in memory is lost, and file buffers do not flush to disk.
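One way to see that SIGKILL cannot be intercepted is to try installing a handler for it. The sketch below assumes a Linux (or other POSIX) host running CPython; can_install_handler is a helper name made up for this example.

```python
import signal


def can_install_handler(signum):
    """Return True if the process is allowed to register a handler for signum."""
    try:
        previous = signal.signal(signum, lambda s, f: None)
    except (OSError, ValueError):
        # The kernel refuses handlers for SIGKILL, so the call fails.
        return False
    # Put back whatever was installed before this probe.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True
```

Under these assumptions, can_install_handler(signal.SIGTERM) succeeds while can_install_handler(signal.SIGKILL) does not, which is why all cleanup has to happen in response to SIGTERM.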
3. The Kubernetes Termination Sequence
Kubernetes orchestrates this dance using the terminationGracePeriodSeconds setting (which defaults to 30 seconds). The exact sequence of events when a pod is deleted is:
- PreStop Hook: If configured in your YAML, the kubelet executes this hook first. The time spent here counts against your total grace period.
- SIGTERM: Once the PreStop hook finishes, SIGTERM is sent to PID 1.
- Grace Period Wait: The kubelet waits for whatever remains of the grace period, which started counting when the deletion was requested.
- SIGKILL: If the container process has not exited by the end of terminationGracePeriodSeconds, SIGKILL is sent to forcibly terminate the process.
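The timing in this sequence can be made concrete with a little arithmetic. The following is a deliberately simplified model, not a description of kubelet internals: Outcome and simulate_termination are hypothetical names, all durations are in seconds measured from the deletion request, and edge cases such as any extra time granted when a hook overruns are ignored.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    sigterm_at: int   # when SIGTERM is sent
    exited_at: int    # when the container stops
    sigkilled: bool   # True if the grace period ran out first


def simulate_termination(grace_period, pre_stop_seconds, shutdown_seconds):
    """Model one container: preStop runs first, then the app needs shutdown_seconds."""
    # The preStop hook consumes part of the same grace period.
    sigterm_at = min(pre_stop_seconds, grace_period)
    clean_exit_at = sigterm_at + shutdown_seconds
    if clean_exit_at <= grace_period:
        return Outcome(sigterm_at, clean_exit_at, False)
    # The app was still running when the grace period expired.
    return Outcome(sigterm_at, grace_period, True)
```

For example, with the default 30-second grace period, a 5-second preStop hook and a 10-second shutdown give simulate_termination(30, 5, 10), which returns Outcome(sigterm_at=5, exited_at=15, sigkilled=False). Raising the shutdown time to 40 seconds yields sigkilled=True at the 30-second mark.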
Summary Comparison
| Feature | SIGTERM | SIGKILL |
|---|---|---|
| Type | Graceful Termination Request | Forceful Immediate Shutdown |
| Handling | Caught and handled explicitly by your code | Cannot be caught or ignored |
| K8s Timing | Sent immediately (or after preStop) | Sent after terminationGracePeriodSeconds expires |
| Intended Action | Save state, close connections, exit safely | Stop execution immediately by the Linux Kernel |
| Customization | Can be overridden via lifecycle.stopSignal | Hardcoded fail-safe; always the final step |
Architectural Considerations & Pitfalls
The "PID 1" Trap
If your container entrypoint (PID 1) is a shell script (e.g., #!/bin/sh) or a binary that does not forward signals to its child processes, your actual application will never receive the SIGTERM. It will keep running until the grace period (30 seconds by default) expires and then be destroyed abruptly by SIGKILL. This is a common cause of dropped connections during rolling updates. Launching the application with exec from the script, or running it under a minimal init process such as tini, lets the signal reach it.
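As an illustration of what "forwarding signals" means, here is a minimal sketch of a wrapper entrypoint written in Python. It assumes a POSIX host; run_and_forward and FORWARDED are hypothetical names, and a real image would more likely rely on exec or an existing init process than on custom code like this.

```python
import signal
import subprocess

# Signals the wrapper relays to the real application (an assumption for this sketch).
FORWARDED = (signal.SIGTERM, signal.SIGINT)


def run_and_forward(command):
    """Start command as a child process and relay termination signals to it."""
    child = subprocess.Popen(command)

    def forward(signum, frame):
        # Pass the signal on instead of letting the wrapper swallow it.
        child.send_signal(signum)

    previous = {sig: signal.signal(sig, forward) for sig in FORWARDED}
    try:
        # Returns the child's exit code (negative signal number if it was signalled).
        return child.wait()
    finally:
        # Restore the handlers that were in place before the wrapper ran.
        for sig, handler in previous.items():
            signal.signal(sig, handler if handler is not None else signal.SIG_DFL)
```

Used as PID 1, such a wrapper would call run_and_forward with the application's command line and exit with the returned code, so the application sees SIGTERM as soon as the kubelet sends it.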
Custom Stop Signals
If your application expects a different signal to gracefully shut down (for example, NGINX traditionally uses SIGQUIT for graceful exits instead of SIGTERM), you can tell Kubernetes to send that signal instead using the pod spec:
```yaml
spec:
  os:
    name: linux  # stopSignal is only accepted when spec.os.name is set
  containers:
    - name: my-app
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]  # Optional delay
        # Tell the kubelet to use SIGQUIT instead of SIGTERM
        stopSignal: SIGQUIT
```