Runaway Process Killer
Automatic protection against runaway CPU and RAM processes on Ubuntu servers. Prevents hosting provider throttling by killing processes that pin CPU to 95%+ for extended periods or exhaust available memory.
Overview
| Protection | Tool | Trigger | Action |
|---|---|---|---|
| CPU | monit | 95%+ CPU for 5 minutes | Kill all processes sharing the top CPU consumer's command name |
| RAM | earlyoom | <5% free memory | Kill highest memory consumer |
| Orphan Claude | monit | Every 15 minutes | Kill Claude processes with no TTY or PPID=1 |
Requirements
- Ubuntu 20.04+ (tested on 24.04 LTS)
- Root access
- ~6MB RAM overhead total
Quick Install
curl -fsSL https://git.upfrontops.cloud/upfrontops/runaway-process-killer/raw/branch/main/install.sh | sudo bash
Or clone and run:
git clone https://git.upfrontops.cloud/upfrontops/runaway-process-killer.git
cd runaway-process-killer
sudo ./install.sh
Manual Installation
1. Install earlyoom (RAM Protection)
sudo apt update && sudo apt install -y earlyoom
Edit /etc/default/earlyoom:
EARLYOOM_ARGS="-m 5 -s 5 --avoid '(^|/)(init|systemd|sshd)$' -r 60"
Enable and start:
sudo systemctl enable earlyoom && sudo systemctl restart earlyoom
2. Install monit (CPU Protection)
sudo apt install -y monit
Edit /etc/monit/monitrc, change daemon interval:
set daemon 60
Enable HTTP interface (required for monit status):
set httpd port 2812 and
use address localhost
allow localhost
allow admin:monit
Create /etc/monit/conf.d/cpu-killer:
check system $HOST
if cpu usage > 95% for 5 cycles then exec "/usr/local/bin/kill-top-cpu.sh"
Create /usr/local/bin/kill-top-cpu.sh:
#!/bin/bash
# Kill every process that shares a command name with the top CPU consumer.
# The grep drops the ps header line (PID) plus protected and transient processes.
TOP_LINE=$(ps -eo pid,comm,%cpu --sort=-%cpu | grep -v -E '(PID|systemd|sshd|monit|earlyoom|bash|ps|awk|grep|head)' | head -1)
TARGET_PID=$(echo "$TOP_LINE" | awk '{print $1}')
COMM=$(echo "$TOP_LINE" | awk '{print $2}')
if [ -n "$TARGET_PID" ] && [ -n "$COMM" ]; then
logger "monit cpu-killer: Killing all '$COMM' processes (detected high CPU on PID $TARGET_PID)"
# Kill all processes with this command name
pkill -9 -x "$COMM"
fi
Make executable and enable:
sudo chmod +x /usr/local/bin/kill-top-cpu.sh
sudo systemctl enable monit && sudo systemctl restart monit
Configuration
CPU Threshold Timing
Edit /etc/monit/conf.d/cpu-killer to change timing:
| Cycles | Time (at 60s interval) |
|---|---|
| 2 | 2 minutes |
| 5 | 5 minutes (default) |
| 10 | 10 minutes |
| 30 | 30 minutes |
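The table is the cycle count multiplied by the daemon interval. A minimal sketch of that arithmetic, assuming the 60-second interval set earlier as the default; the function name cycles_to_minutes is hypothetical and is not part of monit or this repository:

```shell
# Hypothetical helper: convert a monit cycle count into whole minutes.
# $1 = number of cycles, $2 = daemon interval in seconds (defaults to 60).
cycles_to_minutes() {
    local cycles="$1"
    local interval="${2:-60}"
    echo $(( cycles * interval / 60 ))
}

cycles_to_minutes 5        # 5 cycles at the 60s interval
cycles_to_minutes 10 30    # 10 cycles at a 30s interval
```

If the daemon interval is changed, the cycle count in cpu-killer needs to change with it to keep the same wall-clock delay.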
RAM Threshold
Edit /etc/default/earlyoom:
| Setting | Meaning |
|---|---|
| -m 5 | Kill when free RAM < 5% |
| -m 10 | Kill when free RAM < 10% |
| -s 5 | Kill when free swap < 5% |
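To see how close a server currently is to the -m threshold, the percentage can be approximated from /proc/meminfo. This is a rough sketch that assumes the MemAvailable field is the relevant figure; the function name mem_available_percent is hypothetical:

```shell
# Hypothetical helper: print available memory as a whole-number percentage.
# Reads /proc/meminfo-formatted text on stdin so it can be checked against sample data.
mem_available_percent() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }'
}

# On a live Linux system:
if [ -r /proc/meminfo ]; then
    mem_available_percent < /proc/meminfo
fi
```

A value that regularly sits near the configured percentage suggests raising -m or adding memory, rather than relying on the killer.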
Protected Processes
earlyoom protects (via --avoid):
- init, systemd, sshd
kill-top-cpu.sh protects (via grep exclusion):
- systemd, sshd, monit, earlyoom, bash, ps, awk, grep, head
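To add more protected processes, edit the grep pattern in /usr/local/bin/kill-top-cpu.sh. The pattern is unanchored, so it also skips any process whose name merely contains one of these strings (for example, systemd-logind).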
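If exact-name protection is preferred over the substring matching that grep performs, one possible alternative is to filter on the command column with awk. This is a sketch only, not what the installed script does; PROTECTED and filter_protected are hypothetical names, and it assumes command names without spaces:

```shell
# Hypothetical alternative to the grep exclusion: drop only exact command-name matches.
# Input: lines of "PID COMM %CPU", e.g. from: ps -eo pid,comm,%cpu --sort=-%cpu --no-headers
PROTECTED="systemd sshd monit earlyoom bash ps awk grep head"

filter_protected() {
    awk -v list="$PROTECTED" '
        BEGIN { n = split(list, names, " "); for (i = 1; i <= n; i++) skip[names[i]] = 1 }
        !($2 in skip)
    '
}

# cups-browsed contains "ps" but is not an exact match, so it stays a candidate.
printf '%s\n' "101 cups-browsed 97.0" "102 sshd 1.0" "103 bash 0.5" | filter_protected
```

The trade-off is that helper processes such as systemd-logind are no longer skipped implicitly and would need to be listed by name.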
Monitoring
Check Status
# earlyoom status
sudo systemctl status earlyoom
# monit status
sudo monit status
# Combined check
sudo ./scripts/status.sh
View Logs
# earlyoom logs
journalctl -u earlyoom -f
# monit logs
tail -f /var/log/monit.log
# Kill events
journalctl | grep -i "cpu-killer\|earlyoom\|orphan-claude"
Testing
Test CPU Killer
# Install stress tool
sudo apt install -y stress
# For quick testing, temporarily set to 2 cycles in /etc/monit/conf.d/cpu-killer
# then reload: sudo monit reload
# Start CPU stress on every core so system-wide usage exceeds 95% (will be killed after threshold)
stress --cpu "$(nproc)" --timeout 300
Test RAM Killer
# This will be killed quickly by earlyoom
stress --vm 4 --vm-bytes 4G --vm-keep --timeout 120
Test Orphan Claude Killer
# Run the detection script manually to see what it would find
sudo /usr/local/bin/kill-orphan-claude.sh
# Check logs for any kills
journalctl | grep orphan-claude-killer
Uninstall
sudo ./uninstall.sh
Or manually:
sudo systemctl stop earlyoom monit
sudo systemctl disable earlyoom monit
sudo apt remove -y earlyoom monit
sudo rm -f /usr/local/bin/kill-top-cpu.sh
sudo rm -f /usr/local/bin/kill-orphan-claude.sh
sudo rm -f /etc/monit/conf.d/cpu-killer
sudo rm -f /etc/monit/conf.d/orphan-claude-killer
Resource Overhead
| Component | RAM | CPU | Disk |
|---|---|---|---|
| earlyoom | ~2MB | Negligible (adaptive polling) | 77KB |
| monit | ~3-4MB | ~28ms per 60s cycle | 1MB |
| kill script | 0 (runs only when triggered) | Milliseconds | <1KB |
Total: ~6MB RAM, essentially 0% CPU during normal operation
How It Works
Orphan Claude Detection (monit)
A Claude process is considered orphaned if:
- Its controlling TTY is ? (no terminal attached), OR
- Its parent PID is 1 (adopted by init)

- monit runs the orphan detection script every 15 cycles (15 minutes with 60s daemon interval)
- Script finds all claude processes via pgrep -x claude
- For each process, checks TTY (ps -o tty=) and PPID (ps -o ppid=)
- If orphaned, kills the process tree (children first, then parent)
- Logs details to syslog including PID, reason, start time, CPU%, and memory%
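The orphan test described above can be sketched as a small report-only script. This is an illustration under the stated rules, not the actual kill-orphan-claude.sh; the function names orphan_reason and report_orphans are hypothetical, and nothing is killed:

```shell
# Hypothetical sketch of the orphan test; prints the reason or nothing.
# $1 = TTY as shown by ps ("?" means no terminal), $2 = parent PID.
orphan_reason() {
    local tty="$1" ppid="$2"
    if [ "$tty" = "?" ]; then
        echo "no-tty"
    elif [ "$ppid" = "1" ]; then
        echo "ppid-1"
    fi
}

# Report (without killing) every claude process that matches either rule.
report_orphans() {
    local pid tty ppid reason
    for pid in $(pgrep -x claude); do
        tty=$(ps -o tty= -p "$pid" | tr -d ' ')
        ppid=$(ps -o ppid= -p "$pid" | tr -d ' ')
        reason=$(orphan_reason "$tty" "$ppid")
        if [ -n "$reason" ]; then
            echo "PID $pid orphaned ($reason)"
        fi
    done
}

report_orphans
```

Running a report like this first is a low-risk way to check that interactive sessions are not being misclassified before the killer is enabled.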
CPU Protection (monit)
- monit checks system CPU every 60 seconds
- If CPU > 95% for 5 consecutive checks (5 minutes), executes kill script
- Kill script identifies the process using most CPU
- Kills ALL processes with that command name (handles multi-worker processes)
- Logs the action to syslog
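The "for 5 cycles" rule means one low reading resets the count. A minimal sketch of that consecutive-cycle logic, assuming one CPU sample per cycle; monit implements this internally, and first_trigger is a hypothetical name used only for illustration:

```shell
# Hypothetical sketch of the consecutive-cycle rule.
# Reads one CPU percentage per line (one per cycle) and prints the cycle number
# at which the limit has been exceeded for the required number of cycles in a row.
# $1 = CPU limit in percent (default 95), $2 = required consecutive cycles (default 5).
first_trigger() {
    awk -v limit="${1:-95}" -v need="${2:-5}" '
        { streak = ($1 > limit) ? streak + 1 : 0 }
        streak == need { print NR; exit }
    '
}

# The dip to 50 at cycle 3 resets the streak, so the trigger fires at cycle 8.
printf '%s\n' 99 98 50 97 99 96 98 99 | first_trigger
```

This is why short CPU bursts, such as a build or a backup, do not trigger the kill script unless they stay above the limit for the whole window.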
RAM Protection (earlyoom)
- earlyoom monitors available memory (adaptive polling - more frequent when memory is low)
- When available memory drops below 5% (and, with -s 5, free swap is also below 5%), sends SIGTERM to the highest memory consumer
- If process doesn't exit, sends SIGKILL at 2.5% threshold
- Protected processes (init, systemd, sshd) are avoided as kill targets via --avoid
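The two-stage escalation can be sketched as a small decision function. This is an illustration, not earlyoom's code (earlyoom is written in C); it assumes that both the memory and swap figures must be at or below their thresholds, and that the SIGKILL thresholds are half the SIGTERM ones. The name oom_signal is hypothetical:

```shell
# Hypothetical sketch of the SIGTERM/SIGKILL decision.
# $1 = available memory %, $2 = free swap %,
# $3 = memory SIGTERM threshold (the -m value, default 5),
# $4 = swap SIGTERM threshold (the -s value, default 5).
# Assumption: SIGKILL thresholds are half of the SIGTERM thresholds.
oom_signal() {
    local mem="$1" swap="$2" mem_term="${3:-5}" swap_term="${4:-5}"
    awk -v m="$mem" -v s="$swap" -v mt="$mem_term" -v st="$swap_term" 'BEGIN {
        if (m <= mt / 2 && s <= st / 2) print "SIGKILL"
        else if (m <= mt && s <= st)    print "SIGTERM"
        else                            print "none"
    }'
}

oom_signal 4 3     # both under 5%
oom_signal 4 50    # memory low, but plenty of swap left
oom_signal 2 1     # both under 2.5%
```

Under these assumptions, a server with a large, mostly free swap area would not trigger a kill on low RAM alone, which is worth keeping in mind when choosing the -s value.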
Troubleshooting
monit not triggering
# Check monit is running
sudo systemctl status monit
# Check config syntax
sudo monit -t
# Check monit log
tail -f /var/log/monit.log
# Verify CPU threshold is being detected
sudo monit status | grep cpu
earlyoom not killing
# Check earlyoom is running
sudo systemctl status earlyoom
# Check configuration
cat /etc/default/earlyoom
# Watch real-time
journalctl -u earlyoom -f
Kill script not working
# Test manually (warning: this kills whatever is currently the top CPU consumer)
sudo /usr/local/bin/kill-top-cpu.sh
# Check script is executable
ls -la /usr/local/bin/kill-top-cpu.sh
# Trace execution to check for errors (this also runs the kill)
bash -x /usr/local/bin/kill-top-cpu.sh
License
MIT License - Use freely, no warranty.
Author
Created for UpfrontOps infrastructure management.