The Test Harness That Makes AI Agents Verify Themselves | Tom Karels