Episode 15 - Inside the Model Spec
By OpenAI
Categories: AI, Product
Summary
OpenAI's ~100-page Model Spec is not a perfect rulebook but a living document explaining intended behavior across multiple dimensions. It is continuously updated based on real-world deployment data, with hard rules for safety and flexible defaults for tone and style that users can override.
Key Takeaways
- The Model Spec is a ~100-page document structured with high-level goals, granular policies, and dozens of borderline case examples to clarify decision boundaries between competing principles like honesty vs. politeness.
- Critical distinction: The spec is designed primarily for human understanding (employees, users, developers, policymakers), not as an implementation artifact to train models—wording changes never prioritize teaching the model over clarity.
- Model behavior has hard rules (non-negotiable safety constraints) and soft defaults (tone, style, personality) that users can steer differently—this two-tier structure balances safe defaults with user agency.
- The Spec acknowledges it's incomplete—it captures major decisions and intentions but excludes product features (memory), usage policy enforcement, and implementation details, making it a principles document not a system blueprint.
- Real-world deployment drives iteration: teams measure alignment gaps between spec and actual model behavior, gather user feedback on what works/doesn't, then update both the spec and training to close the gap.
Topics
- Model Spec Framework
- AI Safety Alignment
- Model Behavior Guidelines
- LLM Steering and Steerability
- AI Product Policy Design
Transcript Excerpt
Hello, I'm Andrew Mayne and this is the OpenAI podcast. Today we are joined by Jason Wolf, a researcher on the alignment team, to discuss the Model Spec, how it shapes model behavior, and why it's important for anyone building or using AI tools to understand it. The spec often leads where our models actually are today. At this point, you know, models are pretty good at like kind of going out and finding new interesting examples. Models should think through hard problems. Don't start with the an...