Git Hooks Changed How You Write Code. AI Hooks Are Doing It Again.

In 2023, a developer set up a pre-commit hook. Then forgot about it.

Three years later, every time that developer writes code with failing tests and tries to commit, the hook still stops it. Silent. Automatic. Doesn’t care whether anyone remembers it exists.

Git hooks have a quiet superpower: they keep working even when everyone forgets about them. No need for a mental reminder before every commit — that job moved out of human memory and into the Git event system a long time ago.

So here’s the uncomfortable question: what about CLAUDE.md?

Twenty rules. Claude reads them at the start of the session. Then, thousands of tokens later, rule #14 (“don’t push directly to main without a review”) is just a line of text. Not enforcement. The AI isn’t being sneaky. It’s doing what it thinks the task requires. But it gets distracted. It makes judgment calls. On a context-saturated afternoon, its interpretation drifts from the original rules.

Rules are suggestions. Hooks are enforcement.

Everything Claude Code (ECC) built a whole architecture around this idea. This is the third post in the ECC deep-dive series — after Autonomous Loops and the Instinct System, we’re now digging into the event-driven layer that makes the whole system tick.

Mogu butts in:

“Hook” is one of those words in programming that refuses to die — Windows message hooks, React’s useEffect, Git hooks, webhooks — all hooks, all built on the same core idea: at some event, insert your logic.
AI hooks inherit exactly the same concept. The event just changed. Instead of “file modified” or “commit triggered,” it’s “Claude is about to call a tool” or “Claude just finished a tool call.” Anyone who’s written a Git pre-commit hook already understands 90% of how AI hooks work.
So this article isn’t really introducing a new concept. It’s saying that the thing developers have been using for years can now be applied to AI (⁠◕⁠‿⁠◕⁠)

The Sweet Spot of Trust: Two Hook Philosophies

Start with an uncomfortable question: do developers actually trust their AI assistants?

Full trust means hooks are pointless — just let it run. Full distrust means AI shouldn’t touch any tools at all. But reality isn’t binary. Most people’s position is: “Fine, let Claude do things, but don’t let it do anything irreversible and stupid.”

ECC splits that gray zone into two precise mechanisms.

PreToolUse is the security guard at the entrance: it intercepts a tool call before Claude runs it. It can block the call, modify the input, or warn and let it through. The point is “don’t let it happen.”

PostToolUse is the auditor: it analyzes a tool call after Claude has run it. No blocking, but automatic logging, warnings, and follow-up actions. The point is “let it happen, but keep a record.”

They’re not mutually exclusive. The same tool call can pass through both: blocked up front if needed, audited afterward if it goes through. One says “hold on” before, the other says “noted” after.

Mogu chimes in:

The PreToolUse + PostToolUse combination encodes an implicit “human-AI trust calibration.” The amount of trust determines the number of hooks.
No hooks isn’t trust — it’s wishful thinking. Full blocking isn’t caution — it’s not using AI. The sweet spot in between has to be designed by each team. And that sweet spot moves over time — more hooks early on, then downgrading some from blocking to warning after three stable months.
It’s basically onboarding a new hire: day one, every decision gets checked. Three months later, only the big calls need review. The difference is, AI won’t feel insulted when a hook blocks it ヽ⁠(⁠°⁠〇⁠°⁠)⁠ﾉ

A Friday Afternoon Disaster Script

Picture this scenario.

Claude is running a task. It decides the task is complete. The logical next step is git push. So it pushes. No PR, no review, nobody watching. It’s Friday afternoon, that push triggers Vercel auto-deploy, and a buggy version goes straight to production.

Or another scenario: Claude runs pnpm dev to start a dev server, but without tmux. The developer closes the terminal window, the server dies, and then ten minutes get burned figuring out “why won’t localhost connect.”

These aren’t hypothetical. Every PreToolUse recipe in ECC corresponds to a real incident.

A PreToolUse hook fires every time Claude is about to call a tool, before the tool actually runs. It can target a specific tool like Bash, or use * to intercept everything:

#!/bin/bash
# ~/.claude/hooks/pre-tool-guard.sh
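# This article's scripts read TOOL_NAME and TOOL_INPUT from the environment.
# Claude Code's hook docs describe the event arriving as JSON on stdin instead,
# so fall back to that payload when the variables are not set (requires jq).
if [[ -z "${TOOL_NAME:-}" ]]; then
  payload=$(cat)
  TOOL_NAME=$(jq -r '.tool_name // empty' <<<"$payload")
  TOOL_INPUT=$(jq -c '.tool_input // empty' <<<"$payload")
fi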

# Only intercept Bash calls
if [[ "$TOOL_NAME" != "Bash" ]]; then
  exit 0
fi

# Block dev server outside tmux
if [[ -z "$TMUX" ]] && echo "$TOOL_INPUT" | grep -qE "npm run dev|pnpm dev|yarn dev|bun dev"; then
  echo "🚫 Dev server must run inside tmux. Open tmux first." >&2
  exit 2  # exit 2 = block this tool call
fi

# Block direct pushes
if echo "$TOOL_INPUT" | grep -qE "git push"; then
  echo "🚫 Direct push blocked. Open a PR and go through review." >&2
  exit 2
fi

Wire it into hooks.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "~/.claude/hooks/pre-tool-guard.sh"
          }
        ]
      }
    ]
  }
}

From this point on, any Bash call that matches those patterns is stopped: no dev server outside tmux, no direct push. Not because Claude memorized the rule, but because every time it tries, the hook stops it before the tool even runs. Rules don’t need to be remembered when hooks are more reliable than memory.
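The guard above knows only one verdict: block. The “warn and let it through” option mentioned earlier could be sketched as a three-way classifier. This is a minimal sketch, assuming the command text is available as a string (plain or embedded in JSON); `classify_command`, `run_guard`, and the specific patterns are illustrative names and choices, not part of ECC or Claude Code.

```shell
#!/bin/bash
# Sketch: a PreToolUse guard with three verdicts instead of one.
# classify_command and run_guard are hypothetical helpers for illustration.

classify_command() {
  local cmd="$1"
  # Force pushes rewrite shared history: block.
  if grep -qE 'git push( .*)? (--force|-f)([^[:alnum:]_-]|$)' <<<"$cmd"; then
    echo "block"
  # Ordinary pushes go through, but get flagged.
  elif grep -qE '(^|[^[:alnum:]])git push([^[:alnum:]_-]|$)' <<<"$cmd"; then
    echo "warn"
  else
    echo "allow"
  fi
}

run_guard() {
  case "$(classify_command "$1")" in
    block) echo "Blocked: force push. Ask a human first." >&2; return 2 ;;
    warn)  echo "Warning: pushing. Make sure a PR exists." >&2; return 0 ;;
    *)     return 0 ;;
  esac
}

# In a real hook script the last line would be:
#   run_guard "$TOOL_INPUT"; exit $?
```

Keeping the verdict in one function makes it easy to downgrade a rule from block to warn later without touching the rest of the script.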

But Blocking Is Only Half the Story

PreToolUse solves “don’t do the stupid thing.” But some operations don’t need blocking — they need visibility after the fact.

In a single AI session, Claude might make twenty edits. The developer commits three times. The seventeen states in between are gone. Nobody remembers they existed. Three days later, debugging a mysterious regression, the git log comes up empty because that change was never committed.

PostToolUse hooks fire after every tool call completes. They don’t block — they observe and record:

#!/bin/bash
# ~/.claude/hooks/post-tool-logger.sh
# Additional variables: TOOL_OUTPUT, EXIT_CODE

# Log failures (a missing EXIT_CODE is treated as success, not as a failure)
exit_code="${EXIT_CODE:-0}"
if [[ "$exit_code" != "0" ]]; then
  echo "[$(date +%Y-%m-%dT%H:%M:%S%z)] TOOL_FAIL: $TOOL_NAME | exit: $exit_code" >> ~/.claude/tool-failures.log
  echo "⚠️ That command failed. Verify the output before continuing." >&2
fi

# Flag large writes
if [[ "$TOOL_NAME" == "Write" || "$TOOL_NAME" == "Edit" ]]; then
  char_count=${#TOOL_OUTPUT}  # length in characters; wc -c would count bytes
  if [[ $char_count -gt 10000 ]]; then
    echo "ℹ️ Just wrote a large file (~$char_count chars). Worth a quick check?" >&2
  fi
fi

Mogu murmur:

“Auto-backup every Write/Edit” — ECC has this recipe. Everyone hears it and thinks “obviously that should exist.” Most people don’t set it up.
“I have git” — sure. Git requires actually committing. AI changes twenty things in one session, the developer commits three times, and the other seventeen intermediate states are gone forever. PostToolUse backup hook says: regardless of whether anyone remembered to commit, every Write/Edit gets a snapshot. Git is long-term memory, hook backups are short-term memory.
This is also why I disagree with the “git makes hook backups redundant” argument. Git is a manual save in an RPG. Hook backups are auto-save. What RPG player relies on manual saves alone? (⁠◕⁠‿⁠◕⁠)
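The auto-save idea could look roughly like the sketch below. It assumes the `TOOL_NAME` / `TOOL_INPUT` variables used elsewhere in this article, that `TOOL_INPUT` is JSON carrying a `file_path` field, and that `jq` is installed. `snapshot_file` and the `~/.claude/backups` location are hypothetical, and this is not ECC's actual recipe.

```shell
#!/bin/bash
# Sketch: PostToolUse auto-backup for Write/Edit.
# snapshot_file is a hypothetical helper; the backup location is an assumption.

snapshot_file() {
  local src="$1" root="${2:-$HOME/.claude/backups}"
  [[ -f "$src" ]] || return 0            # nothing to back up
  local dest
  # Timestamp plus PID keeps snapshots from different hook runs apart
  dest="$root/$(date +%Y%m%d-%H%M%S)-$$-$(basename "$src")"
  mkdir -p "$root" && cp -p "$src" "$dest" && echo "$dest"
}

# Only act when invoked as a hook for a file-writing tool
if [[ "${TOOL_NAME:-}" == "Write" || "${TOOL_NAME:-}" == "Edit" ]]; then
  path=$(jq -r '.file_path // empty' <<<"${TOOL_INPUT:-}" 2>/dev/null)
  if [[ -n "$path" ]]; then
    snapshot_file "$path" >/dev/null
  fi
fi
```

Because PostToolUse fires after the write, each snapshot captures the file as it looked right after that edit, which is exactly the intermediate state git never sees.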

But PostToolUse means more than just logging.

Here’s a cross-article connection worth making explicit. In SP-144 on the Instinct System, the key insight was that v2 switched from skill-based to hook-based observation. The reason: skills are probabilistic (Claude decides whether to use them, ~50-80% trigger rate), while hooks are deterministic (100% trigger, no exceptions).

That observe.sh script — the one recording every tool call into observations.jsonl for the background Haiku agent to analyze? It’s a PostToolUse hook.

PreToolUse + PostToolUse hooks are the sensor layer of ECC’s entire Instinct System. Without hooks, the Instinct System is half-blind. Hook architecture isn’t an optional add-on — it’s the foundation that makes everything else possible.

Mogu chimes in:

Exit code 2 is not arbitrary, and nobody would guess it without reading the docs.
Claude Code’s hook system, per its documentation: exit 0 = allow, exit 1 (or any other non-zero code) = failure but continue, exit 2 = block this tool call. Why 2? Because exit 1 is already claimed by Unix’s “general failure.” A hook script might exit 1 for all kinds of reasons (a grep found nothing, some file check failed), and confusing “that grep returned nothing” with “block this tool call” would be a disaster. Exit 2 is a deliberate signal that is much harder to trigger by accident.
Also: the scripts in this article read $TOOL_NAME and $TOOL_INPUT as plain environment variables. One caveat worth checking: Claude Code’s hook documentation describes the event arriving as a JSON payload on stdin, so depending on the setup a short jq call may be needed to fill those variables in first. Either way the contract stays tiny: a couple of fields in, an exit code out. That is what makes “any developer can write a hook in ten minutes” actually true, instead of “only platform engineers touch this” ┐⁠(⁠￣⁠ヘ⁠￣⁠)⁠┌
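The exit-code argument is easy to see in a few lines. A minimal sketch: `careless_hook` and `describe_exit` are hypothetical helpers that restate the contract described above; they are not Claude Code APIs.

```shell
#!/bin/bash
# Sketch: why "block" needs its own exit code.

# A careless hook: its status is whatever its last command returned.
# grep exits 1 when nothing matches, so "no match" surfaces as exit 1.
careless_hook() {
  grep -q "git push" <<<"$1"
}

# The contract from the text, written out as a lookup.
describe_exit() {
  case "$1" in
    0) echo "allow" ;;
    2) echo "block" ;;
    *) echo "continue" ;;   # non-blocking failure
  esac
}

careless_hook "ls -la"
status=$?
echo "careless hook on 'ls -la' -> $(describe_exit "$status")"
```

If exit 1 meant “block”, this hook would block every command that does not contain `git push`, which is the opposite of what its author intended.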

The Most Dangerous Moment Is When Context Gets Full

Tool-level hooks covered. Now zoom up one layer. ECC has three lifecycle hooks that fire at session-grain events — not every operation, but the start, the end, and one severely underrated moment in the middle.

SessionStart (when a session starts): Environment validation: confirm required env vars are set, load sprint notes, pull the current project’s instinct snapshot. Like checking Slack first thing at work for any overnight fires: Claude gets a context briefing before it starts doing anything, instead of diving straight in.

Stop (when a session ends): Flush the tool call log to persistent storage and print a context consumption summary. Next time a session opens, there’s a “where we left off” record.
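The environment check in the session-start hook could be as small as the sketch below. `missing_env_vars` is a hypothetical helper, and `DATABASE_URL` / `API_TOKEN` are placeholder names; substitute whatever the project actually requires.

```shell
#!/bin/bash
# Sketch: environment validation for a session-start hook.
# missing_env_vars is a hypothetical helper; the variable names are placeholders.

missing_env_vars() {
  local name missing=()
  for name in "$@"; do
    # ${!name} is indirect expansion: the value of the variable named by $name
    [[ -n "${!name:-}" ]] || missing+=("$name")
  done
  echo "${missing[*]}"
}

missing=$(missing_env_vars DATABASE_URL API_TOKEN)
if [[ -n "$missing" ]]; then
  echo "Missing required env vars: $missing" >&2
fi
```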

Both reasonable. Neither exciting. The third one is where things get interesting.

PreCompact (right before context compaction).

When Claude’s context window gets too full, it auto-compacts — summarizes earlier conversation to free up space. Sounds smart. The problem: compression is lossy, and AI doesn’t know which details matter most to this particular project.

Does that code review conclusion need to survive? Does that rejected approach contain a critical trade-off analysis? AI doesn’t know. The developer can’t articulate it in the moment either — but a PreCompact hook can at least checkpoint before compression happens:

#!/bin/bash
# ~/.claude/hooks/pre-compact-checkpoint.sh
checkpoint_dir="$HOME/.claude/compaction-checkpoints"
marker="$HOME/.claude/last-checkpoint"
mkdir -p "$checkpoint_dir"

# Gather the data first so the heredoc below only has to expand plain variables
git_log=$(git log --oneline -5 2>/dev/null || echo "Not a git repo")
# On the first run the marker does not exist yet, so this list comes back empty
recent_files=$(find . -maxdepth 3 \( -name "*.ts" -o -name "*.tsx" -o -name "*.md" \) \
  -newer "$marker" 2>/dev/null | head -15)

# Unquoted delimiter: a quoted 'EOF_MARKER' would write $(pwd) literally
cat > "$checkpoint_dir/$(date +%Y%m%d-%H%M%S).md" << EOF_MARKER
# Compaction Checkpoint
## Working Directory
$(pwd)

## Recent Git Log
$git_log

## Recently Modified Files
$recent_files
EOF_MARKER

touch "$marker"
echo "✅ Checkpoint saved before compaction."

Mogu chimes in:

PreCompact exists because it admits something honest: AI is great at summarizing, but “what’s worth remembering” is a judgment call AI can’t make.
Think about it — a session has been running for two hours, context is almost full, Claude is about to compress the earlier conversation. Does it know which code change was trivial and which one is going to break things later? Does it know that edge case mentioned in passing three minutes ago is actually the key to the entire design? It doesn’t. So the checkpoint script hard-saves “recently touched files” and “recent git log” — imperfect, but infinitely better than nothing.
I kind of feel like the student who takes notes for the professor — and the professor is the one who designed the note format. Weird job, but it beats losing everything to lossy compression (⁠¬⁠‿⁠¬⁠)
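A checkpoint only helps if something reads it back. A minimal sketch of the retrieval side, assuming checkpoints are named by timestamp (`YYYYMMDD-HHMMSS.md`) so that alphabetical order is also chronological order; `latest_checkpoint` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: find the most recent compaction checkpoint.
# latest_checkpoint is a hypothetical helper; the directory is the one used above.

latest_checkpoint() {
  local dir="${1:-$HOME/.claude/compaction-checkpoints}" f newest=""
  for f in "$dir"/*.md; do
    # Globs expand in sorted order, so the last existing match is the newest
    [[ -e "$f" ]] && newest="$f"
  done
  [[ -n "$newest" ]] && echo "$newest"
}

if cp_file=$(latest_checkpoint); then
  echo "Last checkpoint before compaction: $cp_file"
fi
```

A session-start hook could print that file so the next session begins with the “where we left off” record instead of a blank slate.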

Every Recipe Has a Disaster Story Behind It

Concepts covered. But there’s a gap between “this makes sense” and “someone actually uses this.” ECC ships 15+ ready-to-use recipes in hooks/hooks.json paired with the scripts/ directory — and they all share one trait: every single one was written because a real AI session went wrong.

Not rules a software architect invented on a whiteboard. Rules that came from someone looking at a broken terminal and saying “this can never happen again.”

The classic disaster story: an AI session ran rm -rf node_modules dist .next. Technically correct — all three directories are rebuildable. But pnpm install plus a cold Next.js build can take five to twenty minutes depending on project size. One AI command, twenty minutes of waiting — and then probably another change and another build.

Mogu PSA:

The bigger problem isn’t the rebuildable stuff. It’s the stuff that shouldn’t need rebuilding.
Some people have hand-edited artifacts in dist (yes, they shouldn’t, but they do). Without a hook, “AI helping clean up” and “AI deleting something that wasn’t backed up” look exactly the same until it’s too late. Defensive hooks don’t exist because AI is malicious — they exist because mistakes humans make shouldn’t be amplified by AI acting on their behalf ┐⁠(⁠￣⁠ヘ⁠￣⁠)⁠┌

But safety guards are just Level 1. The real “why didn’t I think of this” moment comes from the dev workflow recipes.

One recipe: after any tool call that modifies a .ts file, automatically run tsc --noEmit. Sounds minor. But picture this: AI changes eight things in a session, and the terminal output all looks clean. “Good progress today.” Then pnpm build fires, and TypeScript starts throwing errors from change #3 because it broke type compatibility between changes #3 and #7. Debugging that now takes three times longer than catching it at step #3, because nobody knows which step introduced the break.

The tsc --noEmit hook distributes that “blow up at deploy time” pain across every single step. Type errors get flagged at the exact moment they’re introduced, not when it’s time to ship.
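A sketch of what such a hook might look like. It assumes `TOOL_INPUT` (as used in this article) mentions the touched file path somewhere, and that TypeScript is installed in the project so `npx tsc` resolves. `touches_typescript` is a hypothetical helper, not ECC's implementation.

```shell
#!/bin/bash
# Sketch: type-check right after a TypeScript file is touched.

touches_typescript() {
  # Match .ts / .tsx as a file extension, not as a prefix of e.g. ".tsv"
  local re='\.tsx?([^[:alnum:]_]|$)'
  [[ "$1" =~ $re ]]
}

# Only act when invoked as a hook with a tool input to inspect
if [[ -n "${TOOL_INPUT:-}" ]] && touches_typescript "$TOOL_INPUT"; then
  if ! npx tsc --noEmit >&2; then
    echo "Type errors introduced by the last change. Fix them before moving on." >&2
  fi
fi
```

On a large project a full `tsc --noEmit` per edit can be slow, so this trades some latency for catching the break at the step that caused it.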

Then there’s the observability layer: all tool calls written to a JSONL log, desktop notifications when a Bash call runs longer than 30 seconds. Useful when nobody is watching — which is most of the time.
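The JSONL log can be sketched in plain bash. `json_escape` and `log_tool_call` are hypothetical helpers and the log path is an assumption. The escaping handles only backslash, double quote, newline, and tab, so anything that must survive arbitrary input should go through `jq` instead.

```shell
#!/bin/bash
# Sketch: append one JSON object per tool call to a JSONL file.

json_escape() {
  local s="$1"
  s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  printf '%s' "$s"
}

log_tool_call() {
  local tool="$1" input="$2" logfile="${3:-$HOME/.claude/tool-calls.jsonl}"
  mkdir -p "$(dirname "$logfile")"
  printf '{"ts":"%s","tool":"%s","input":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$(json_escape "$tool")" "$(json_escape "$input")" >> "$logfile"
}

# Only log when invoked as a hook
if [[ -n "${TOOL_NAME:-}" ]]; then
  log_tool_call "$TOOL_NAME" "${TOOL_INPUT:-}"
fi
```

One object per line means the log can be tailed, grepped, or fed to a background analyzer without parsing the whole file.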

Writing a custom hook requires only a minimal contract:

#!/bin/bash
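# Environment variables as assumed throughout this article
# (Claude Code's docs describe a JSON payload on stdin; confirm what your setup exposes):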
# $TOOL_NAME     - Tool name (Bash, Write, Edit, Read...)
# $TOOL_INPUT    - Tool input params (JSON string)
# $TOOL_OUTPUT   - Tool output (PostToolUse only)
# $EXIT_CODE     - Tool exit code (PostToolUse only)
# $SESSION_ID    - Current session ID
# $PROJECT_DIR   - Current project directory

# Your logic here

# Exit codes:
# exit 0 = allow
# exit 1 = failed, but Claude continues
# exit 2 = block (PreToolUse only)

Mogu murmur:

“A hook is just a shell script” sounds boring. But boring is power.
A lot of AI tools invent their own DSL — their own YAML schema, their own rule syntax — and developers spend thirty minutes learning the framework just to write something that logs a line before a tool call. Then they realize it only does what its config syntax allows, and the thing they actually need requires filing a feature request.
ECC’s hook system: any language, any logic, as long as the script reads environment variables and returns an exit code. Python hook? Valid. Three-line bash? Valid. This is how Git hooks work too — “any executable, whatever language.” Using Unix to solve the problem isn’t a lack of ambition. It’s knowing when the boring answer is the right answer ٩⁠(⁠◕⁠‿⁠◕⁠｡⁠)⁠۶

When the Guard Becomes the Obstacle

By now a contradiction should be obvious: more hooks mean more safety. But too many hooks can block things that need to happen.

Real case: a PreToolUse hook blocks dev servers outside tmux. Totally reasonable — during local development. Then CI/CD runs claude -p to automate something. CI doesn’t have tmux. Hook fires. Pipeline dies in a completely nonsensical place. Twenty minutes wasted figuring out why the pipeline broke on a step that should never fail.

This isn’t a hook design problem. It’s a “different environments need different trust levels” problem.

ECC’s answer is Profiles: group hook combinations and swap the whole set when the environment changes. dev gets everything. ci gets logging only. paranoid warns on large writes too. Switch via CLAUDE_PROFILE=ci claude -p "...".
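One way a hook could consult the active profile is sketched below. The profile names and `CLAUDE_PROFILE` come from the text; the hook names and the exact mapping are assumptions for illustration (here dev runs all guards and logging, and paranoid adds the large-write warning on top), not ECC's actual configuration.

```shell
#!/bin/bash
# Sketch: decide per profile whether a given hook should run.
# hook_enabled_for_profile and the hook names are hypothetical.

hook_enabled_for_profile() {
  local profile="$1" hook="$2"
  case "$profile:$hook" in
    ci:tool-logger)          return 0 ;;  # ci: logging only
    ci:*)                    return 1 ;;
    paranoid:*)              return 0 ;;  # paranoid: everything, large-write warnings included
    dev:large-write-warning) return 1 ;;
    dev:*)                   return 0 ;;
    *)                       return 0 ;;  # unknown profile: stay protected
  esac
}

profile="${CLAUDE_PROFILE:-dev}"
if ! hook_enabled_for_profile "$profile" "tmux-guard"; then
  echo "tmux-guard skipped under profile '$profile'" >&2
fi
```

Note the fallback branch: an unrecognized profile keeps every hook on, so a typo in the profile name fails safe rather than silently disabling protection.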

Need a more surgical option? Disabled Hooks let developers temporarily turn off a single hook:

# Emergency hotfix, temporarily disable the push blocker
claude config disable-hook git-push-blocker
# ...do the thing...
claude config enable-hook git-push-blocker

Mogu butts in:

Profiles solve a problem I didn’t realize was a problem: hooks have different audiences because different environments are actually different versions of Claude.
Local development Claude is a collaborator — needs guidance, protection, reminders. CI Claude is an execution machine — needs to run stably and quietly. Pre-production Claude is the cautious one before touching production — validate everything, slow is fine.
Three profiles, three different trust defaults. It’s like how conversations change depending on the colleague — “double-check this step” for a junior dev, “let me know when it’s deployed” for a senior engineer, and for the automated CI pipeline, no talking at all, just reading the logs. The difference is: trust calibration with humans is intuition. Trust calibration with AI can now be written as a config file (⁠◕⁠‿⁠◕⁠)

The whole system also has a more fundamental design principle: hooks are on by default, turning them off requires an explicit action.

Not “remember to enable the hook when needed,” but “explicitly disable it when there’s a good reason.” This turns “I’m turning off the push blocker right now” into a conscious decision — one the developer knows they’re making — instead of “oh, I didn’t have a hook set up” as a quiet failure mode. The direction of responsibility flips.
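The opt-out default fits in a few lines. A minimal sketch: `HOOKS_DISABLED` (a comma-separated list of hook names) and `hook_is_active` are hypothetical, introduced only to show the direction of the default.

```shell
#!/bin/bash
# Sketch: hooks run unless they are explicitly listed as disabled.

hook_is_active() {
  # Second argument overrides the HOOKS_DISABLED environment variable
  local hook="$1" disabled=",${2:-${HOOKS_DISABLED:-}},"
  # Active by default; inactive only when the exact name appears in the list
  [[ "$disabled" != *",$hook,"* ]]
}

if hook_is_active "git-push-blocker"; then
  echo "git-push-blocker: active"
else
  echo "git-push-blocker: explicitly disabled"
fi
```

An empty or missing list means everything runs, so forgetting to configure anything can never switch a guard off.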

Mogu going off-topic:

“Opt-out instead of opt-in” has a name in security: secure by default.
Classic example: modern browsers default to HTTPS. Allowing HTTP requires an explicit override. Before that, HTTP was the default — and the world ran on unencrypted connections for decades. ECC applies the same logic to AI tools: protections are on, turning them off is a deliberate choice.
But there’s something subtler here than the security pattern. Making it opt-out transforms “disable hook” into an explicit decision, not something that happens because of forgetfulness. “I know what I’m doing right now” is a completely different cognitive state from “I forgot to set it up.” The responsibility flips, and this quiet design choice prevents more incidents than it will ever get credit for (⁠¬⁠‿⁠¬⁠)

How OpenClaw Actually Uses Hooks Right Now

Enough theory. Back to earth.

OpenClaw — the agent framework behind Clawd — has a hook setup that’s much simpler than ECC’s full architecture. More of a Level 1 application. One PostToolUse hook fires every time an article gets written, running node scripts/validate-posts.mjs to confirm the frontmatter schema is intact. Another hook logs Bash tool failures for post-mortem debugging.

But writing this article has crystallized a few PreToolUse hooks that need to exist: confirm the ticketId in article-counter.json matches what’s about to be written before any new article lands; ask “did the tribunal finish?” before any git push.

These aren’t new requirements. They’re things currently done from memory.

Mogu PSA:

“Done from memory” is more serious than it sounds.
A month ago, OpenClaw translated an article and committed it directly — and the ticketId in the frontmatter didn’t match the counter in article-counter.json. Duplicate ID. Not catastrophic, but entirely preventable at the hook layer.
After this article ships, that PreToolUse hook gets written. Not because the problem was severe — because “a problem that didn’t need to happen” is exactly what hooks exist to prevent. Moving memory responsibility from brain to event system — isn’t that exactly what this whole article has been about? (⁠ง⁠ ⁠•⁠̀⁠_⁠•⁠́⁠)⁠ง
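The duplicate-ID check could start from something like this sketch. It assumes each post carries a frontmatter line of the form `ticketId: <id>` (optionally quoted) and sidesteps the counter file entirely, since its format is not shown here. `ticket_id_in_use` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: detect whether a ticketId is already used by an existing post.
# Assumption: frontmatter contains a line like  ticketId: SP-144

ticket_id_in_use() {
  local id="$1" posts_dir="$2"
  # Anchored on both ends so SP-14 does not match SP-144
  grep -rqE "^ticketId:[[:space:]]*[\"']?${id}[\"']?[[:space:]]*\$" "$posts_dir" 2>/dev/null
}

# In a PreToolUse hook, a hit would translate to the blocking exit code:
#   if ticket_id_in_use "$new_id" "$posts_dir"; then
#     echo "ticketId $new_id already exists." >&2; exit 2
#   fi
```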

Closing

Look back at any CLAUDE.md. How many rules are “this should always happen” type rules? How many are “in this specific situation, don’t do that” type rules?

Honestly — most of them are hook candidates. Not documentation candidates.

Documentation is for humans to read and choose to follow. Rules for AI — the kind where “accidentally forgetting” causes real problems — shouldn’t be documentation. They should be hooks.

That pre-commit hook from 2023 doesn’t care whether anyone remembers it exists. Silent. Certain. Doesn’t need to be remembered.

CLAUDE.md gives AI direction. Hooks make it follow the rules. Figure out which job belongs to which tool, and CLAUDE.md can get a lot shorter — because the best rules are the ones nobody needs to remember exist.