by _heimdall 199 days ago
I believe the author is making the point that the companies spending all this money on hardware aren't concerned at all with how the hardware is actually used.

Optimization isn't even being considered because it's the total amount spent on hardware that is the goal, not the output from the hardware.

2 comments

I have a little trouble believing that Mr “Stop wasting tokens by saying please to LLMs” Altman is not considering how his models can be optimized. I suppose the real question is how accurate the utilization numbers in the article are.
I stopped paying attention to any specific thing Sam Altman says a while ago. I've seen too many examples of interviews or off the cuff interactions that make me think very little of him personally.

For example, I could see him saying not to waste tokens on "please" simply because he thinks that is a stupid way to use the LLM. I.e. a judgement on anyone that would say please, not a concern over token use in his data centers.

But can that really be the case? It takes a long time to train and tune the models, so squeezing out even a small, low-single-digit % of extra efficiency implies much faster iteration.
Until investors specifically incentivize speed or cost for the next iteration, I wouldn't expect them to optimize for efficiency.

Right now it seems investment is primarily based on vibes, media hype, and total spend on hardware and infrastructure.