Inside How Anthropic Is Building the Next Claude | Alex Albert

By Peter Yang

Categories: Product, Startup

Summary

Anthropic treats AI model development like product management, with research PMs defining specific capability requirements (coding, knowledge work, spreadsheets) before training begins. They use Claude itself to analyze user feedback at scale, creating a self-improving loop where customer insights directly inform the next model iteration.

Key Takeaways

Research PMs spec out exact model capabilities before training (coding, knowledge work, spreadsheet proficiency), then validate intuitions during the training process since outcomes aren't fully predictable until training completes.
Models are treated as products exposed across multiple surfaces (API, Cloud Code, Claude UI), meaning different prompts and use cases require thinking through the entire product experience, not just the model itself.
Use Claude to analyze user feedback at scale—clustering themes, identifying top problems, and creating synthetic test cases to diagnose model gaps without drowning in raw feedback data.
Research PMs talk to internal teams, API customers, and direct Claude users to identify where models excel and fall short, then feed that intelligence back into the next model training cycle.
Every new model iteration focuses on both new capabilities (emerging areas like spreadsheet work) and fixing specific weaknesses from the previous version through targeted training interventions.

Related topics

Transcript Excerpt

I was definitely the first prompt engineer at Anthropic. I might have been the first in the world. We treat the model as if it's a product to some degree. With every new model, we are specking out exactly what do we want this model to be good at. When the agent isn't running a task for you, or maybe it's in the background, it's actually going through its memories, finding things that [music] might contradict, pruning them, cleaning them up. This concept of dreaming. Engine time isn't as much of a one-way door these days. If it's something that's not a one-way door, that's effectively free at this point. Agents that are running on task for a long amount of time and they're having to make a lot [music] of judgment decisions. The questions of what its character is and what it cares about are …

More from Peter Yang