Initial commit from B2 archive
179
docs/plans/2026-01-22-session-manager-design.md
Normal file
@@ -0,0 +1,179 @@
# Claude Session Manager for `cw`

## Overview

Add a pre-flight check at the start of `cw` that monitors running Claude sessions and enforces a tiered response based on system load and session count. The check prevents CPU overload from too many concurrent Claude instances and helps identify and clean up orphaned processes.

## Problem Statement

- Multiple Claude instances running simultaneously cause sustained 100% CPU
- User intentionally multitasks across projects but loses track of session count
- Orphaned Claude processes (from closed terminals) pile up and waste resources

## Tiered Logic

```
┌─────────────────────────────────────────────────────────────┐
│                         cw() starts                         │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│  Count running Claude processes + get their details         │
│  Get system 5-min load average                              │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
                     ┌─────────────────────┐
                     │  Sessions >= 4 OR   │
                     │  Load avg >= 60%?   │
                     └─────────┬───────────┘
                          yes  │  no
               ┌───────────────┴───────────────┐
               ▼                               ▼
      ┌──────────────────┐            ┌─────────────────┐
      │ HARD BLOCK       │            │ INFO ONLY       │
      │ Must kill one    │            │ Show sessions   │
      │ to proceed       │            │ (if any)        │
      └──────────────────┘            │ Continue auto   │
                                      └─────────────────┘
```

### Rules Summary

| Condition | Behavior |
|-----------|----------|
| Load < 60% AND sessions < 4 | Show running sessions as FYI, continue to category menu |
| Load >= 60% OR sessions >= 4 | Hard block - must kill a session to proceed |

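
The two rules reduce to a single comparison. A minimal sketch, assuming a hypothetical helper named `_cw_decide_tier` (it is not one of the functions listed in this design) that receives the session count and the load percentage as whole numbers:

```bash
# Hypothetical helper: prints "block" or "info" for a given state.
# $1 = number of running Claude sessions, $2 = load as an integer percent.
_cw_decide_tier() {
    local sessions="$1" load_percent="$2"
    if [ "$sessions" -ge 4 ] || [ "$load_percent" -ge 60 ]; then
        echo "block"   # hard block: a session must be killed first
    else
        echo "info"    # info only: show sessions, then continue
    fi
}
```

Both thresholds are inclusive, so exactly 4 sessions or exactly 60% load already lands in the hard-block tier.
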
## Session Display Format

Each session is shown as:

```
/opt/clients/acme-corp (2h 15m, 47% CPU)
/opt/infra/project-selector (ORPHAN) (45m, 12% CPU)
```

Orphans are flagged with an `(ORPHAN)` tag so the user knows they are safe to kill.

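
One way to produce this format is sketched here. It assumes the runtime comes from the `etime` column of `ps`, which uses the layout `[[DD-]hh:]mm:ss`, and that CPU% is truncated rather than rounded. `_cw_format_etime` and `_cw_strip_zero` are hypothetical helper names, and the argument order of `_cw_format_session` is an assumption.

```bash
# Strip one leading zero so "08" is not read as octal inside $(( )).
_cw_strip_zero() {
    case $1 in
        0?*) printf '%s' "${1#0}" ;;
        *)   printf '%s' "$1" ;;
    esac
}

# Convert ps etime ([[DD-]hh:]mm:ss) into "2h 15m" or "45m".
_cw_format_etime() {
    local etime="$1" days=0 hours=0 mins=0 rest
    case $etime in
        *-*) days=${etime%%-*}; etime=${etime#*-} ;;
    esac
    case $etime in
        *:*:*) hours=${etime%%:*}; rest=${etime#*:}; mins=${rest%%:*} ;;
        *:*)   mins=${etime%%:*} ;;
    esac
    hours=$(( $(_cw_strip_zero "$days") * 24 + $(_cw_strip_zero "$hours") ))
    mins=$(_cw_strip_zero "$mins")
    if [ "$hours" -gt 0 ]; then
        printf '%dh %02dm' "$hours" "$mins"
    else
        printf '%dm' "$mins"
    fi
}

# $1 = working directory, $2 = ps etime, $3 = ps pcpu, $4 = 1 if orphaned
_cw_format_session() {
    local tag=""
    [ "$4" = "1" ] && tag=" (ORPHAN)"
    printf '%s%s (%s, %s%% CPU)\n' "$1" "$tag" "$(_cw_format_etime "$2")" "${3%.*}"
}
```

For example, `_cw_format_session /opt/clients/acme-corp 02:15:00 47.3 0` prints `/opt/clients/acme-corp (2h 15m, 47% CPU)`.
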
## Implementation Details

### Detecting Claude Sessions

```bash
# List Claude processes with: PID, CPU%, elapsed time, controlling TTY, command line
# (ps right-aligns the PID column, so allow leading spaces before the digits)
ps -eo pid,pcpu,etime,tty,args | grep -E '^ *[0-9]+ .*claude$'
```

For each PID, get the working directory via:

```bash
readlink -f /proc/<PID>/cwd
```

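
The two steps can be combined into one pass. The sketch below is only an illustration: it assumes Linux with procps `ps`, matches processes whose last command-line word is `claude` (bare or with a path), and emits `|`-separated records rather than the array the design mentions. `_cw_filter_claude` is a hypothetical helper name.

```bash
# Keep only rows whose command line ends in "claude" (bare or with a path)
# and print: PID CPU% ELAPSED TTY. Reads ps-style rows on stdin.
_cw_filter_claude() {
    awk '$NF == "claude" || $NF ~ /\/claude$/ { print $1, $2, $3, $4 }'
}

# Emit one "PID|CPU%|ELAPSED|TTY|CWD" record per running Claude process.
_cw_get_claude_sessions() {
    local pid cpu etime tty cwd
    # Separate -o options with empty headers suppress the header row.
    ps -e -o pid= -o pcpu= -o etime= -o tty= -o args= | _cw_filter_claude |
    while read -r pid cpu etime tty; do
        # The process may exit (or be unreadable) between ps and readlink.
        cwd=$(readlink -f "/proc/$pid/cwd" 2>/dev/null) || continue
        printf '%s|%s|%s|%s|%s\n' "$pid" "$cpu" "$etime" "$tty" "$cwd"
    done
}
```

Splitting the filter from the `ps` call keeps the matching rule testable against canned input.
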
### Detecting Orphans

A Claude process is orphaned if:
- Its controlling TTY is `?` (no terminal attached), OR
- Its parent PID is 1 (adopted by init)

Check parent PID via:
```bash
ps -o ppid= -p <PID>
```

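
Expressed as a predicate, the two conditions might look like the following. `_cw_is_orphan` is a hypothetical name, and the whitespace trimming is there because `ps -o ppid=` pads its output.

```bash
# $1 = controlling TTY as printed by ps, $2 = parent PID (may be space-padded)
# Returns 0 (true) when either orphan condition holds.
_cw_is_orphan() {
    local tty="$1" ppid
    ppid=$(printf '%s' "$2" | tr -d '[:space:]')
    [ "$tty" = "?" ] || [ "$ppid" = "1" ]
}

# Typical call for a live process (hypothetical variable names):
#   _cw_is_orphan "$tty" "$(ps -o ppid= -p "$pid")" && tag=" (ORPHAN)"
```
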
### Load Average

Use the 5-minute load average from `/proc/loadavg`, normalized to CPU count:

```bash
load=$(awk '{print $2}' /proc/loadavg)   # field 2 is the 5-minute average
cpus=$(nproc)
# Express load as a percentage of total CPU capacity
percent=$(awk -v load="$load" -v cpus="$cpus" 'BEGIN { printf "%.0f", (load / cpus) * 100 }')
```

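
As a worked example of the normalization, a 5-minute load of 2.92 on a 4-CPU machine gives 2.92 / 4 = 0.73, or 73%. The sketch below separates the arithmetic from the system reads so it can be checked with fixed inputs. `_cw_calc_load_percent` is a hypothetical name; `_cw_get_load_percent` is the helper name from this design, but its body here is an assumption.

```bash
# $1 = load average, $2 = CPU count; prints load as a whole-number percent.
_cw_calc_load_percent() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.0f", (load / cpus) * 100 }'
}

# Wrapper that reads the live values (Linux only).
_cw_get_load_percent() {
    _cw_calc_load_percent "$(awk '{print $2}' /proc/loadavg)" "$(nproc)"
}
```

The result can exceed 100 when more tasks are runnable than there are CPUs, for example a load of 6 on 4 CPUs yields 150.
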
### Kill Flow

1. User selects session from fzf menu
2. Send `SIGTERM` to PID
3. Wait up to 5 seconds, checking if process died
4. If still alive, send `SIGKILL`
5. Confirm death, refresh session list

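
Steps 2 through 5 can be sketched as follows. This is an illustration under assumptions, not the final implementation: it polls once per second with `kill -0`, the "Force killed." and failure messages are invented for the example, and refreshing the session list is left to the caller.

```bash
# $1 = PID to stop. Returns 0 once the process is gone, 1 otherwise.
_cw_kill_session() {
    local pid="$1" waited=0
    # Cannot signal: the process is already gone or is not ours.
    kill -TERM "$pid" 2>/dev/null || return 1
    # Poll for up to 5 seconds; kill -0 only tests whether the PID exists.
    while [ "$waited" -lt 5 ]; do
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "Terminated gracefully."
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Still alive after the grace period: escalate.
    kill -KILL "$pid" 2>/dev/null
    sleep 1
    if kill -0 "$pid" 2>/dev/null; then
        echo "Failed to kill PID $pid." >&2
        return 1
    fi
    echo "Force killed."
}
```

Note that `kill -0` also succeeds for a zombie that its parent has not yet reaped, so a process can look alive briefly after it has exited.
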
## User Experience Flow

### Hard Block Scenario (>=4 sessions OR >=60% load)

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠️ Cannot start new Claude session
Load: 73% (5-min avg) | Sessions: 4 running
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Select a session to kill (or Esc to cancel):

> /opt/clients/acme-corp (2h 15m, 47% CPU)
  /opt/pre-revenue/newapp (45m, 23% CPU)
  /opt/infra/project-selector (ORPHAN) (1h 02m, 3% CPU)
  /opt/fun/gamedev (12m, 18% CPU)
```

After selection:
```
Stopping /opt/clients/acme-corp (PID 10764)...
Terminated gracefully.

Proceeding to project selection...
```

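
The menu shows directories while the kill step needs a PID, so the selection has to map back to one. A possible approach, assuming each row is fed to fzf as `PID<TAB>display text` and the PID column is hidden with fzf's `--with-nth` option. `_cw_pick_session` is a hypothetical name.

```bash
# stdin: one "PID<TAB>display text" row per session.
# Prints the chosen PID; returns 1 if the user cancels or nothing is selected.
_cw_pick_session() {
    local tab choice
    tab=$(printf '\t')
    # --with-nth=2 displays only the second field; fzf still outputs the full row.
    choice=$(fzf --delimiter="$tab" --with-nth=2 \
                 --header='Select a session to kill (or Esc to cancel):') || return 1
    # Keep everything before the first tab, which is the PID.
    printf '%s\n' "${choice%%"$tab"*}"
}
```

fzf exits with a non-zero status when the user presses Esc, which this sketch turns into a return value of 1.
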
### Info Only Scenario (<4 sessions AND <60% load)

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📊 Load: 34% (5-min avg) | Sessions: 2 running
• /opt/clients/acme-corp (2h 15m, 47% CPU)
• /opt/pre-revenue/newapp (45m, 23% CPU)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

Then immediately shows the category menu (no user action needed).

### No Sessions Running

Skip the status display entirely and go straight to the category menu.

## Code Structure

### New Helper Functions

| Function | Purpose |
|----------|---------|
| `_cw_get_claude_sessions` | Returns array of running Claude PIDs with metadata |
| `_cw_format_session` | Formats a session for display (dir, runtime, CPU%, orphan tag) |
| `_cw_get_load_percent` | Returns 5-min load avg as percentage of CPU capacity |
| `_cw_kill_session` | Graceful kill with SIGTERM→wait→SIGKILL fallback |
| `_cw_session_check` | Main orchestrator: gathers data, decides tier, shows UI |

### Integration Point

At the very top of `cw()`, after dependency checks:

```bash
cw() {
    # Dependency checks (existing)
    ...

    # NEW: Session/load check
    _cw_session_check || return 0

    # Rest of existing code
    ...
}
```

`_cw_session_check` returns:
- `0` (success) - proceed with `cw`
- `1` (failure) - user cancelled at hard block, exit `cw`

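
A rough sketch of how the orchestrator could honor this contract. The three stubs are hypothetical stand-ins driven by `CW_FAKE_*` variables so the example runs on its own; `_cw_session_count` and `_cw_pick_and_kill` are invented names, and the single-pass behavior (one kill, then proceed) is an assumption based on the hard-block example output.

```bash
# Hypothetical stand-ins so this sketch runs on its own. In cw.sh these would
# be real helpers; the CW_FAKE_* variables exist only for illustration.
_cw_session_count()    { echo "${CW_FAKE_SESSIONS:-0}"; }
_cw_get_load_percent() { echo "${CW_FAKE_LOAD:-0}"; }
_cw_pick_and_kill()    { [ "${CW_FAKE_CANCEL:-0}" -eq 0 ]; }   # 0 = killed, 1 = cancelled

_cw_session_check() {
    local sessions load
    sessions=$(_cw_session_count)
    load=$(_cw_get_load_percent)

    # Info-only tier: nothing blocks, so let cw continue.
    if [ "$sessions" -lt 4 ] && [ "$load" -lt 60 ]; then
        if [ "$sessions" -gt 0 ]; then
            echo "Load: ${load}% (5-min avg) | Sessions: ${sessions} running"
        fi
        return 0
    fi

    # Hard-block tier: the user must kill a session or cancel.
    echo "Cannot start new Claude session"
    echo "Load: ${load}% (5-min avg) | Sessions: ${sessions} running"
    _cw_pick_and_kill || return 1
    echo "Proceeding to project selection..."
    return 0
}
```

Because the caller uses `_cw_session_check || return 0`, a cancelled hard block ends `cw` quietly with a success status instead of surfacing as an error in the user's shell.
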
### File Changes

Only `cw.sh` needs modification; all new functions are added to the same file.