A senior colleague found a model eighty percent better than anything she'd used all year. Better at what, though? And in your own work, how would you know if the next version quietly lost that edge? You work through the same chat box every day, with almost no view of what the model behind it can do.