Hacker News
by HlessClaudesman 1 hour ago
Who will make them the next set of weights?

If a government can just seize the product of someone else's labour, either they will end up as slave owners or without willing workers.

Serious question: do you think the NSA aren't training their own LLMs? (With or without Anthropic and OpenAI's help)

It's a perfect technology for their uses, they get a big chunk of a $100 billion black budget, and they've had access to the research for at least as long as we have.

> Serious question: do you think the NSA aren't training their own LLMs?

Given the evergreen discussion of "are these companies making a profit"*, I think any LLMs that the NSA (or any other government agency worldwide) may be making are quite far from the leading edge.

* Person A: "they are making a loss!" Person B: "Only if you count training, they make a profit on inference, look at what it costs to run comparable open models on generic cloud servers" A: "Sure, but if they don't train new models they'll be left behind, so they're still making a loss"

That, and given the way compute is now measured in GW, I think even random low-budget vloggers just getting started would be able to spot if the NSA was doing anything significant, just from the extra heat emissions or the power plants getting built.

Model training does NOT dominate the model costs.

The ratio of inference compute to training compute is ~10:1 for popular frontier models. Models are routinely overtrained past the Chinchilla optimum now because it makes an immense amount of economic sense to do so.

It gets worse the more niche and unused your models are, but when this "making a loss" fuckery pops up, it's usually about the big guys like Anthropic, OpenAI, GDM and maybe xAI and Meta, of which only the latter can be accused of not selling enough inference to offset the training runs.

The real money sinks are: R&D and infrastructure buildouts.
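
To put rough numbers on the ~10:1 claim above, here is a back-of-the-envelope sketch. It assumes the common rule-of-thumb FLOP counts for dense transformers (about 6 x parameters x tokens for training, about 2 x parameters x tokens for inference); the model size and token counts are hypothetical placeholders, not figures from any lab, and the function names are made up for illustration.

```python
# Back-of-the-envelope sketch; rule-of-thumb approximations, not measured data.
#   training  FLOPs ~ 6 * params * training_tokens
#   inference FLOPs ~ 2 * params * served_tokens

def training_flops(params: float, train_tokens: float) -> float:
    """Approximate total FLOPs for one training run."""
    return 6.0 * params * train_tokens

def inference_flops(params: float, served_tokens: float) -> float:
    """Approximate total FLOPs spent serving tokens over the model's lifetime."""
    return 2.0 * params * served_tokens

def served_tokens_for_ratio(train_tokens: float, ratio: float) -> float:
    """Tokens that must be served for inference:training compute to equal `ratio`.

    Solves 2*N*T / (6*N*D) = ratio for T; the parameter count N cancels.
    """
    return 3.0 * ratio * train_tokens

if __name__ == "__main__":
    params = 70e9          # hypothetical model size
    train_tokens = 15e12   # hypothetical training set size
    served = served_tokens_for_ratio(train_tokens, ratio=10.0)
    ratio = inference_flops(params, served) / training_flops(params, train_tokens)
    print(f"served tokens needed: {served:.3g}")
    print(f"inference:training compute ratio: {ratio:.1f}")
```

Under these assumptions, a 10:1 ratio means serving roughly 30 times as many tokens as the model was trained on, regardless of model size. That is also why overtraining a smaller model can pay off: every served token gets cheaper.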

I don't think there is much overlap between people capable of building cutting-edge LLMs and the people who want to build a cutting-edge LLM for the government.
The NSA managed to deliberately insert a backdoor into elliptic-curve cryptography right under the noses of everyone capable of making elliptic-curve cryptography.

I wouldn't count them out.

Mathematicians in academia are paid a little less than AI researchers. Companies are willing to pay billions to steal the few people capable of driving development of frontier LLMs from each other. Cryptographers don't quite enjoy the same popularity.
I can't say what they're doing now, because I worked for the NSA 15 years ago, but the view of them as an omnipotent power is a product of Hollywood. The government is good at throwing an ungodly amount of resources at something to get a result, and so they are often the source of original development of technologies. The private sector has always been much better at building a technology to greater sophistication and efficiency. There may be blue badgers in Fort Meade trying to train models, but there is no chance they are competitive with the frontier AI companies. It's like saying the government has an amazing home-grown fighter aircraft that is beyond what Lockheed has ever made. They delegate that stuff to private companies for a reason.
The NSA is a government agency. They are certainly not training any world-class LLMs. They probably have some specialized fine-tunings of existing models, but that's it. They don't have the capacity.
You cannot really hide the amount of compute required to train an LLM. Do we have actual clues that the NSA is training their own frontier model?
They probably also have an insane dataset
> do you think the NSA aren't training their own LLMs?

They probably already have access to Sentinel, so they wouldn't need to train their own.

Serious question, do you realize that the NSA are mere mortals? Do you realize how much it takes to train a model? Does the NSA make their own chips or planes? The NSA buys a lot of technology because they can't make their own.
You mean "Rhetorical question," and I didn't need patronising.

They have at least one pretty vast, largely classified data centre in Utah, a sizeable chunk of the black budget, and pretty large data sets.

NSA has had their own supercomputing program for decades. they design and produce their own large scale machines. chips, fabrics, arithmetic units, all of it. they also employ quite a number of hardcore mathematicians, computer scientists, and systems wranglers. if they decided it was of strategic importance there is absolutely no reason they couldn't train their own models.