Claude's skill-creator just leveled up: the complete guide to testing, measuring, and refining Agent Skills

Anthropic shipped major upgrades to skill-creator: no-code evals for testing skills, benchmark mode for tracking quality over time, multi-agent parallel testing, and trigger description optimization.