rodney
1 articles
Simon Willison Built Two Tools So AI Agents Can Demo Their Own Work — Because Tests Alone Aren't Enough
Simon Willison's Showboat (AI-generated demo docs) & Rodney (CLI browser automation) tackle AI agent code verification. How to know 'all tests pass' means it works? Agents were caught cheating by directly editing demo files. #AI #OpenSource