Nathan:

Lots of good points, but I think you’re overly cynical on the utility of older, cheaper models. For a lot of things they’ll be really worthwhile; as you noted, the Haiku trick makes Claude Code a lot more cost-efficient… not every task will need a GPT-5-level “worker”, just a GPT-5-level planner…
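Roughly the shape I mean: a minimal sketch, assuming the Anthropic Python SDK, with illustrative model IDs and prompts (not a claim about how Claude Code does it internally):

```python
# Sketch of a planner/worker split: one expensive call to plan, many cheap
# calls to execute. Model IDs are illustrative, not what Claude Code uses.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PLANNER = "claude-opus-4-20250514"    # pricey, called once per task
WORKER = "claude-3-5-haiku-20241022"  # cheap, called once per step

def ask(model: str, prompt: str) -> str:
    resp = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def run_task(task: str) -> list[str]:
    # One expensive call to produce a numbered plan...
    plan = ask(PLANNER, f"Break this task into short, numbered steps:\n{task}")
    steps = [line.strip() for line in plan.splitlines() if line.strip()]
    # ...then many cheap calls to carry the steps out.
    return [ask(WORKER, f"Task: {task}\nDo only this step: {step}") for step in steps]
```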

Devin was doing usage-based pricing last time I checked. Those guys don’t shy away from swiping your card.

I think you’re right that burning cash to try and grow will blow up in most companies’ faces, but that’s the same as it ever was. Many in-betweeners will eventually raise their rates and, while the Claude Code 24/7 background-looper types who ruin the party for everyone will hem and haw, most business users especially will shrug and move on. I saw this happen with Docker. They were, and maybe still are, troubled, but they eventually clamped down hard on all the free stuff they were giving away, and everyone moved on.

Ethan Ding:

fair, although for the consumer market of flat subscriptions, there seems to be no way through

Nathan:

Yes, I believe we're currently in a relatively egalitarian landscape for access to the tools. Once the belt has to tighten, I think a lot of people and businesses will get caught having to pay ever-increasing rates for AI to stay on top. It might create a weird stratification at a societal level as "I can afford to get my kids $20K-a-year GPT-7 subscriptions" becomes the new "I can afford to pay for Harvard".

Tommy:

The last point is really interesting. Let's hope access to intelligence isn't limited by a mile-wide stretch in capabilities between open source and closed source.

intermediation:

I prefer consistent results over “glimpses of genius” for many tasks. I did some promptfoo tuning for a medical video-conference “write-up assistant” on AWS Bedrock running Claude. It would freak out the doctors if Sonnet suddenly spat out a PhD-level text from its training set 🤣 My recommendation was to use only Haiku 2.5; it was better at not randomly turning into a professor of medicine.

So now I try to use the weaker models and to improve my prompts/workflows. In reality, a time-and-motion view shows that an LLM feels great when it tricks you with one brilliant answer, only for you to find that the slot machine doesn't pay out 80% of the time. It is better to optimise for what actually saves your time. The older models are probably good enough for 90% of work, as long as that work isn't just “stealing IP” that the models memorised.
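The consistency check itself can be dead simple; here's a rough sketch of the kind of thing I mean, with a hypothetical call_model() standing in for the Bedrock client and a purely illustrative threshold:

```python
# Rough consistency probe: run the same prompt several times and see whether
# the model's answers drift. call_model() is a hypothetical stand-in for
# whatever client (Bedrock, the Anthropic SDK, ...) you actually use.
from difflib import SequenceMatcher
from itertools import combinations

def call_model(model_id: str, prompt: str) -> str:
    raise NotImplementedError("wire this up to your provider")

def consistency(model_id: str, prompt: str, runs: int = 5) -> float:
    outputs = [call_model(model_id, prompt) for _ in range(runs)]
    pairs = list(combinations(outputs, 2))
    # Average pairwise similarity: 1.0 means every run said the same thing.
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# Illustrative gate: stick with the cheaper model only if it answers consistently.
# if consistency("haiku", "Summarise this visit note: ...") > 0.8:
#     print("the weaker model is steady enough for this workflow")
```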

Ben:

IMO he's not necessarily making a point based on the utility of those older models, but on folks' reluctance to use them when there's a (relatively) smarter one available. Intelligent down-switching to "dumber" models seems workable to me, but it's high-risk in a commoditized product environment (especially when some of the players have wallets that'd make King Midas blush).

Nathan:

Perplexity seems to down-switch in some form now, and it's painful, so I have some sympathy for the argument.

Earl Lee:

This problem only exists if the tool allows the user to select the model they're using. But who actually wants to go through the hassle of picking the right model for the task in the first place? I can see most tools moving towards how Claude Code works, where you don't select the model directly. You can still pay up to have the best model in the repertoire, but that doesn't have to mean you can always force the tool to use that best model.
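A hypothetical sketch of that kind of auto-routing (the heuristics and function names are made up, and this isn't how Claude Code actually decides):

```python
# Hypothetical auto-routing: the product, not the user, decides when the
# expensive model is worth invoking. Heuristics and names are made up.
from typing import Callable

def needs_best_model(prompt: str) -> bool:
    # Toy escalation rules; a real product would use a classifier or a
    # confidence signal from the cheap model instead of keyword matching.
    long_context = len(prompt) > 4_000
    hard_ask = any(w in prompt.lower() for w in ("prove", "refactor", "architecture"))
    return long_context or hard_ask

def answer(prompt: str,
           call_cheap: Callable[[str], str],
           call_best: Callable[[str], str]) -> str:
    # Paying for the top tier buys access to call_best, not a guarantee
    # that every request gets routed to it.
    return call_best(prompt) if needs_best_model(prompt) else call_cheap(prompt)
```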

Another observation: I'm a ChatGPT Pro subscriber who used to spam Deep Research and o1 Pro, but honestly, these days I find myself valuing speed of response more and preferring cheaper but faster models like 4o.

Claude L. Johnson Jr.:

I agree that people will use weaker models for simpler use cases. Cost will become a multivariate equation based on which models went into generating the output. Still, the ultimate end game will be the foundation model providers partnering with (e.g., being bought out by or merging with) the neoclouds. We saw this with all of the dark fiber during the dot-com bust. Vertical integration is the only way out other than the token-cost short squeeze. If the model providers decide to raise their prices, they're betting that their portfolio of models can make them a one-stop shop for developers. If they merge with the infrastructure providers, model-use discounts can be built into the Pro, Max and Enterprise contracts.
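To make the multivariate point concrete, a back-of-the-envelope sketch; the per-million-token prices and model names here are placeholders, not anyone's actual rate card:

```python
# Blended cost of one request that touched several models. Prices are
# illustrative placeholders in USD per million tokens, not real rate cards.
PRICES = {                     # (input, output)
    "big-planner":  (15.00, 75.00),
    "mid-worker":   ( 3.00, 15.00),
    "small-router": ( 0.25,  1.25),
}

def request_cost(calls: list[tuple[str, int, int]]) -> float:
    """calls is a list of (model, input_tokens, output_tokens)."""
    total = 0.0
    for model, tokens_in, tokens_out in calls:
        price_in, price_out = PRICES[model]
        total += tokens_in / 1e6 * price_in + tokens_out / 1e6 * price_out
    return total

# One agentic request: a little expensive planning, lots of cheap execution.
print(round(request_cost([
    ("big-planner", 2_000, 800),
    ("mid-worker", 40_000, 6_000),
    ("small-router", 5_000, 200),
]), 4))  # ≈ 0.3015
```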
