Skip to content

Enhancement: trigger /compact on context size (token %), not only tool-call count #2155

@jbegarek

Description

@jbegarek

Summary

scripts/hooks/suggest-compact.js currently triggers purely on cumulative tool-call count (first at COMPACT_THRESHOLD/50, then every 25). Tool count is a weak proxy for actual window pressure, and adding a context-size signal would make the suggestion fire when it actually matters.

Why tool count alone is weak

  • A few large operations (big file reads, large MCP tool responses) can fill a large fraction of the window in very few tool calls — the count signal stays silent.
  • Conversely, many tiny calls can cross 50 while the window is barely used — it nags early.
  • In MCP/Bash-heavy sessions the count climbs without reflecting real context growth.

Proposal

Add a second (primary) signal based on the real context size, read from the latest assistant turn's usage in the session transcript. transcript_path is already provided in the PreToolUse hook stdin payload.

  • Sum input_tokens + cache_read_input_tokens + cache_creation_input_tokens from the most recent usage record (these three partition the prompt, so the sum is the true context size of the turn).
  • Compare to a window-scaled threshold (e.g. 250k on a 1M-context model, 160k on a 200k window; env-overridable), and re-remind every +N tokens (e.g. 60k).
  • Detect the window: treat as 1M if the model id carries a [1m] marker, or if observed tokens already exceed 200k (covers logs that drop the suffix); else 200k.
  • Surface via the same output({ hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext } }) path the count signal already uses (post-PreToolUse warn-only hooks: stderr output not visible to users #2082), so it stays non-blocking.

Reference sketch

function readContextTokens(transcriptPath) {
  const lines = fs.readFileSync(transcriptPath, 'utf8').split('\n');
  for (let i = lines.length - 1; i >= 0; i--) {
    const l = lines[i].trim(); if (!l) continue;
    let rec; try { rec = JSON.parse(l); } catch { continue; }
    const u = rec?.message?.usage; if (!u) continue;
    const t = (u.input_tokens||0)+(u.cache_read_input_tokens||0)+(u.cache_creation_input_tokens||0);
    if (t > 0) return { tokens: t, model: rec.message.model || '' };
  }
  return null;
}

Threshold crossing is tracked with a small per-session "bucket" so it only re-fires when context grows by another step.

I'm happy to open a PR if this direction is welcome.

Filed after auditing a local copy of the skill; surfaced via Claude Code.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions