by ben_w 3 hours ago
> Serious question: do you think the NSA aren't training their own LLMs?

Given the evergreen discussion of "are these companies making a profit"*, I think any LLMs that the NSA (or any other government agency worldwide) may be making are quite far from the leading edge.

* Person A: "They are making a loss!" Person B: "Only if you count training. They make a profit on inference; look at what it costs to run comparable open models on generic cloud servers." A: "Sure, but if they don't train new models they'll be left behind, so they're still making a loss."

That, and compute is now measured in GW. I think even random low-budget vloggers just getting started would be able to spot it if the NSA were doing anything significant, just from the extra heat emissions or the power plants getting built.

1 comment

Model training does NOT dominate the model costs.

The ratio of inference compute to training compute is ~10:1 for popular frontier models. Models are now routinely overtrained past the Chinchilla optimum because it makes an immense amount of economic sense to do so: the extra training compute buys a smaller model that is cheaper to serve on every token.
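To put rough numbers on that, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration, not data from any lab: the usual rules of thumb of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per served token, roughly 20 training tokens per parameter as the Chinchilla optimum, and two hypothetical models (70B and 35B) that are simply assumed to end up at comparable quality.

```python
# Back-of-the-envelope compute accounting. All constants are common
# rules of thumb and all model sizes are hypothetical.

def train_flops(params, train_tokens):
    # ~6 FLOPs per parameter per training token (forward + backward)
    return 6.0 * params * train_tokens

def inference_flops(params, served_tokens):
    # ~2 FLOPs per parameter per generated/processed token (forward only)
    return 2.0 * params * served_tokens

def lifetime_flops(params, train_tokens, served_tokens):
    return train_flops(params, train_tokens) + inference_flops(params, served_tokens)

# Hypothetical "Chinchilla-optimal" model: 70B params, ~20 tokens per param.
big_params, big_tokens = 70e9, 1.4e12

# Hypothetical overtrained model: half the size, 4x the tokens.
# ASSUMPTION: we pretend it reaches comparable quality; that is not derived here.
small_params, small_tokens = 35e9, 5.6e12

# Tokens served so that the big model's inference:training ratio is 10:1.
# inference/training = (2*N*S) / (6*N*D) = S / (3*D)  =>  S = 3 * ratio * D
served = 3.0 * 10 * big_tokens

big_ratio = inference_flops(big_params, served) / train_flops(big_params, big_tokens)
big_total = lifetime_flops(big_params, big_tokens, served)
small_total = lifetime_flops(small_params, small_tokens, served)

print(f"big model inference:training ratio = {big_ratio:.1f}")
print(f"big model lifetime FLOPs   = {big_total:.3e}")
print(f"small model lifetime FLOPs = {small_total:.3e}")
```

Under these assumptions the overtrained model costs twice as much to train but comes out cheaper over its lifetime, because at that serving volume the inference term dominates. With little traffic the comparison flips, which is why this only pays off for models that are actually used a lot.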

The ratio gets worse the more niche and unused your models are, but when this "making a loss" fuckery pops up, it's usually about the big guys like Anthropic, OpenAI, GDM and maybe xAI and Meta, of which only the latter can be accused of not selling enough inference to offset the training runs.

The real money sinks are R&D and infrastructure buildouts.