The Future of AI-Driven Software Engineering: Agents, Loops, and ROI
YouTube
This fireside chat at the @Scale conference features Jesse Chen from Meta and Boris Cherny from Anthropic discussing the transformative role of AI in software development. Cherny, who heads Claude Code, shares his personal experience of writing 100 percent of his code through AI agents and explains how large language models like Claude are shifting the bottleneck of development from writing code to generating and refining high-quality ideas. The conversation highlights the evolution from manual coding to agentic workflows and the emergence of loops, which represent higher-order autonomous processes.
Beyond just writing code, the duo explores the broader implications of AI for organizational productivity and return on investment. Cherny emphasizes the importance of giving all employees access to AI tokens and creating a safe environment for experimentation. He also discusses new tools such as Claude Code for developers, Co-work for non-engineers, and automated systems for code and security review. The session concludes with a look toward the future, predicting that AI will continue to move up the abstraction ladder, eventually mastering complex systems design and providing sophisticated educational feedback to help users grow their own technical skills.
The video features a deep-dive fireside chat between Meta's Jesse Chen and Anthropic's Boris Cherny on the current state and future trajectory of AI in software engineering. They discuss the transition from using AI as a simple assistant to employing autonomous agents and loops that can handle complex end-to-end tasks like code maintenance, security patching, and even project management. The discussion specifically covers the evolution of Claude models, the launch of Claude Code, and the philosophy behind maximizing return on investment through AI experimentation.
Key Takeaways
Software development is shifting from manual coding to prompting agents that write code.
Boris Cherny personally reports writing 100 percent of his code via AI since late 2023.
Loops represent the next stage of AI evolution, where agents autonomously iterate on tasks like code reviews or feedback collection.
Companies should maximize the return on investment (ROI) and upside of AI rather than focusing only on cost-cutting.
Broad access to AI across all departments, not just engineering, leads to the most innovative process improvements.
New features like Auto-mode and Dynamic Workflows allow agents to run autonomously for hours or days while remaining safe.
Timestamps
00:23
Introduction: Jesse Chen and Boris Cherny introduce themselves and the session goals.
01:42
Measuring AI Usage: Boris discusses writing 100% of his code with AI and his token usage.
03:00
Efficiency and ROI: Discussion on why companies should focus on return rather than just token costs.
08:53
Defining Loops: Explaining the concept of loops as higher-order autonomous agent functions.
12:26
Claude Co-work: How non-engineers use agents to automate administrative and project tasks.
16:15
The Fable Model: Deep dive into the reasoning capabilities of Anthropic's advanced models.
20:06
Autonomous Maintenance: Using AI for long-term code maintenance and refactoring architectural bottlenecks.
26:22
Preventing Developer Laziness: How Auto-mode and exploratory output styles maintain security and learning.
Target Audience
Software engineers, product managers, engineering leaders, and technology enthusiasts interested in the practical application and future trends of AI in software development life cycles.
Use Cases
- Understanding how to transition an engineering team from manual coding to AI-assisted development.
- Learning metrics for measuring the ROI and productivity gains of AI tool adoption in an organization.
- Discovering how non-engineers can use AI agents to automate project management and administrative tasks.
- Evaluating the benefits of automated code and security review products.
- Staying informed on the latest features and release roadmaps of Anthropic's Claude ecosystem.
A central theme of the discussion is the rapid movement up the abstraction ladder in software development. Just a few years ago, developers wrote every line of source code by hand. Then came a transition in which AI agents began writing code snippets or functions. Boris explains that we are now moving into the era of loops. In this paradigm, an agent doesn't just respond to a single prompt but runs continuously to monitor codebases, review pull requests, or gather user feedback autonomously. This shift moves the bottleneck of development away from syntax and implementation toward the generation of high-quality ideas and system architecture.
Measuring AI Productivity and ROI
When organizations consider adopting AI at scale, they often focus heavily on the cost per token. Boris argues that this is the wrong framing. Instead, companies should look at ROI. While a highly capable model like Claude 3.5 Sonnet or the Fable series might be more expensive than smaller models, the productivity gains can be orders of magnitude higher. For example, Anthropic has seen an 8x increase in code production per engineer. By giving tokens to everyone in the organization, including accountants and marketers, companies unlock hidden efficiencies that senior leadership might never have identified themselves. The focus should remain on increasing the return rather than decreasing the investment during this early adoption phase.
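The difference between the cost framing and the return framing can be made concrete with a little arithmetic. The sketch below uses the standard ROI formula (net gain divided by cost). The two sets of figures are hypothetical, made up purely for illustration, and do not come from the talk or from any real pricing.

```python
def roi(value_created: float, cost: float) -> float:
    """Return on investment: net gain divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (value_created - cost) / cost


# Hypothetical monthly figures per engineer, in the same currency unit.
# These numbers are invented for illustration and are not from the talk.
small_model = {"cost": 100.0, "value_created": 300.0}
large_model = {"cost": 400.0, "value_created": 2400.0}

small_roi = roi(small_model["value_created"], small_model["cost"])
large_roi = roi(large_model["value_created"], large_model["cost"])
small_net = small_model["value_created"] - small_model["cost"]
large_net = large_model["value_created"] - large_model["cost"]

print(f"small: ROI={small_roi:.1f}, net gain={small_net:.0f}")
print(f"large: ROI={large_roi:.1f}, net gain={large_net:.0f}")
```

With these assumed inputs, the model that costs four times as much still has the higher ROI (5.0 versus 2.0) and the larger net gain (2000 versus 200). A comparison based only on cost per token would miss this.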
Automated Code and Security Reviews
As the volume of AI-generated code increases, downstream bottlenecks such as human code review become more pronounced. Anthropic has addressed this by building Claude Code Review and Claude Security. These tools use high token counts and deep reasoning to find bugs and vulnerabilities that even professional penetration testers might miss. By automating the quality assurance and security layers, the entire development lifecycle accelerates. Boris notes that their internal products now catch 98 to 99 percent of bugs before a human even sees the pull request, allowing developers to focus on the high-level intent of the change rather than searching for logic errors.
Practical Applications
Viewers can apply these insights by integrating AI agents into their daily workflows beyond simple chat. Engineers should experiment with loops for maintenance tasks like refactoring duplicated abstractions or fixing flaky tests. Non-engineers can use tools like Co-work to automate administrative overhead such as syncing spreadsheets with Slack messages or booking travel based on calendar events. Managers should enable exploratory output styles in their AI tools to help junior staff learn the codebase faster through AI-generated explanations of changes. Finally, teams should explore Auto-mode to allow agents to handle long-running background tasks without requiring constant human approval for every sub-command.
Frequently Asked Questions
What is the difference between an AI agent and an AI loop?
An AI agent typically performs a discrete task based on a specific prompt, such as writing a function or a unit test. A loop is a higher-order function in which an agent runs continuously or periodically to monitor a system, gather data, and execute multiple tasks over time without direct human intervention for each step. Loops are used for ongoing maintenance, feedback gathering, and system optimization.
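The "higher-order function" description can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Claude Code or any Anthropic product implements loops. The `Agent` type, `run_loop`, `stub_agent`, and `poll_for_tasks` are all hypothetical names, and the agent is a stub rather than a real model call.

```python
from typing import Callable, Iterable, List

# Hypothetical type: an "agent" is any callable that takes a task
# description and returns a result. A real agent would call a model.
Agent = Callable[[str], str]


def run_loop(
    agent: Agent,
    find_tasks: Callable[[], Iterable[str]],
    max_rounds: int,
) -> List[str]:
    """Repeatedly discover tasks and hand each one to the agent.

    The loop is the higher-order part: it takes the agent as an argument
    and decides when to invoke it, with no human prompt per task.
    """
    results: List[str] = []
    for _ in range(max_rounds):
        tasks = list(find_tasks())
        if not tasks:
            break  # nothing left to do, stop early
        for task in tasks:
            results.append(agent(task))
    return results


def stub_agent(task: str) -> str:
    """Stand-in for a real agent; it only echoes the task."""
    return f"done: {task}"


# Hypothetical work that "appears" over time, one batch per polling round.
pending_batches = [
    ["review open pull request", "fix flaky test"],
    ["collect user feedback"],
]


def poll_for_tasks() -> List[str]:
    """Return the next batch of tasks, or an empty list when none remain."""
    return pending_batches.pop(0) if pending_batches else []


results = run_loop(stub_agent, poll_for_tasks, max_rounds=5)
print(results)
```

A single agent call corresponds to one `agent(task)` invocation. The loop is everything around it: polling for work, dispatching it, and stopping when nothing is left or the round budget is spent. A production loop would presumably run on a schedule or on events rather than for a fixed number of rounds.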
How should a company handle the high cost of the most advanced AI models?
Boris Cherny recommends focusing on the upside potential rather than cost cutting. The most advanced models offer a significantly higher return on investment because they can solve harder problems and automate more complex workflows. Companies can also use an advisor model approach, where a cheaper model handles basic requests and calls out to a more expensive model only when a high level of reasoning or nuance is required.
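One way to picture the advisor approach is a cheap model that answers directly when it can and flags the request for escalation when it cannot. The sketch below is an assumption about how such routing might look, not a description of any Anthropic API. `Draft`, `answer`, `cheap_model`, and `advisor_model` are hypothetical names, and both models are stubs. The keyword check stands in for whatever signal a real system would use to decide that deeper reasoning is needed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    """Hypothetical result from the cheap model."""
    text: str
    needs_advice: bool  # True when the cheap model wants to escalate


def answer(
    request: str,
    cheap: Callable[[str], Draft],
    advisor: Callable[[str], str],
) -> str:
    """Try the cheap model first; call the advisor only when asked to."""
    draft = cheap(request)
    if draft.needs_advice:
        return advisor(request)
    return draft.text


def cheap_model(request: str) -> Draft:
    """Stub: escalates anything that mentions architecture (an assumption)."""
    hard = "architecture" in request.lower()
    return Draft(text=f"cheap answer to: {request}", needs_advice=hard)


advisor_calls = []


def advisor_model(request: str) -> str:
    """Stub for the expensive model; records each call so usage is visible."""
    advisor_calls.append(request)
    return f"advisor answer to: {request}"


easy = answer("rename this variable", cheap_model, advisor_model)
hard = answer("redesign the service architecture", cheap_model, advisor_model)
print(easy)
print(hard)
print(f"advisor was called {len(advisor_calls)} time(s)")
```

The design keeps the expensive model off the default path, so its cost is paid only for requests the cheap model flags.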
Can AI handle large-scale code maintenance and refactoring?
Yes. Using loops and dynamic workflows, AI can now analyze entire codebases to identify architectural weaknesses, duplicated code, or outdated libraries. Developers can prompt the agent to improve the architecture or unify abstractions, and the agent will autonomously generate pull requests for these improvements. This addresses the long-term maintenance bottleneck that plagues many legacy software projects.
AI-Powered Software Development
Measuring ROI in AI Adoption
Agentic Workflows and Loops
Autonomous Code and Security Review
The Future of Human-AI Collaboration