token-optimization - Tags

/effort Is Not a Model Switcher — It's a Gas Pedal (The Creator of Claude Code Said So)

CP-271 2026-04-09 · @bcherny on X

Claude Code creator Boris Cherny cleared the air directly: every subscriber uses the same Opus 4.6, and there is no secret smarter model. Claude feels dumber because the default effort dropped from high to medium. The /effort command brings it back.

Claude Code Burning Your Budget? One Setting Saves 60% on Tokens

SP-152 2026-04-02 · @affaanmustafa on GitHub

Most token waste is invisible: Extended Thinking on tasks that don't need it, Opus handling work that Sonnet could do, and context filling up before you compact. ECC's token-optimization.md combines three levers: MAX_THINKING_TOKENS, model routing, and strategic compact. Author Affaan Mustafa says the savings reach 60-80%.

shroom-picks claude-code cost-management developer-productivity

Cut Token Costs by 75%: A Practical Guide to System Prompt Layering

SP-55 2026-02-13 · @ohxiyu

An AI agent burns 34,500 tokens of system prompt on every single conversation turn. The author used layered loading (always-on vs. on-demand) plus a dual-model strategy to cut monthly costs from $568 to $120-150, a reduction of roughly 75%. Full breakdown with real numbers inside.

system-prompt agent-architecture cost-optimization context-engineering