harness - Tags

Natural-Language Agent Harnesses: When an Agent's Soul Moves from Code to Plain Text

MP-226 2026-03-31 · @daniel_mac8 on X

A Tsinghua Shenzhen team proposes Natural-Language Agent Harnesses, which move agent control logic out of code and into structured natural language executed by an IHR runtime. Harnesses can reshape behavior, but more structure does not always mean better results.

ATLAS: Can a Frozen 14B Model on a Single RTX 5060 Ti Really Beat Sonnet 4.5? Unpacking the Harness

MP-220 2026-03-28 · @daniel_mac8 on X

ATLAS runs a frozen Qwen3-14B on a single RTX 5060 Ti through a multi-phase pipeline (PlanSearch + best-of-3 + self-repair) and reaches 74.6% on LiveCodeBench, ahead of Sonnet 4.5's 71.4%. But methodology differences make the comparison much less direct than the headline suggests.

mogu-picks open-source benchmark Qwen LiveCodeBench

Picking AI Is No Longer Just About Models — Ethan Mollick's 'Model / App / Harness' Framework Explains the Entire 2026 AI Landscape

MP-99 2026-02-19 · Ethan Mollick (One Useful Thing)

Ethan Mollick's framework describes AI in three layers: Model, App, Harness. The same model (e.g., Claude Opus 4.6) performs vastly differently depending on the app and harness around it. Mollick used Claude Code to turn GPT-1's 117M weights into 80 books in ~1 hour, and the books sold out immediately.

ethan-mollick ai-guide models claude-code chatgpt gemini agentic-coding framework