Hacker News
by StackOverlord 1031 days ago
For the first time, with GPT-4, OpenAI has been able to predict model progress accurately:

> A large focus of the GPT-4 project has been building a deep learning stack that scales predictably. The primary reason is that, for very large training runs like GPT-4, it is not feasible to do extensive model-specific tuning. We developed infrastructure and optimization that have very predictable behavior across multiple scales. To verify this scalability, we accurately predicted in advance GPT-4’s final loss on our internal codebase (not part of the training set) by extrapolating from models trained using the same methodology but using 10,000x less compute:

> Now that we can accurately predict the metric we optimize during training (loss), we’re starting to develop methodology to predict more interpretable metrics. For example, we successfully predicted the pass rate on a subset of the HumanEval dataset, extrapolating from models with 1,000x less compute:

> We believe that accurately predicting future machine learning capabilities is an important part of safety that doesn’t get nearly enough attention relative to its potential impact (though we’ve been encouraged by efforts across several institutions). We are scaling up our efforts to develop methods that provide society with better guidance about what to expect from future systems, and we hope this becomes a common goal in the field.

Source: https://openai.com/research/gpt-4
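
For anyone wondering what "extrapolating from models trained with less compute" might look like mechanically, here is a minimal sketch. It is not OpenAI's method or data: the power-law form `loss = a * compute**(-b)`, the constants `TRUE_A` and `TRUE_B`, and the helper names `fit_power_law` and `predict_loss` are all assumptions made up for illustration, and the "measurements" are synthetic.

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of log(loss) = log(a) - b * log(compute).

    Assumes a pure power law with no irreducible-loss term (a simplification).
    Returns the pair (a, b).
    """
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope and intercept in log-log space.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_loss(a, b, compute):
    """Evaluate the fitted power law at a given compute budget."""
    return a * compute ** (-b)


# Synthetic, made-up numbers: compute is expressed as a fraction of the
# full-size run, so the largest small run uses 10,000x less compute.
TRUE_A, TRUE_B = 10.0, 0.05
small_runs = [1e-6, 1e-5, 1e-4]
losses = [TRUE_A * c ** (-TRUE_B) for c in small_runs]

a, b = fit_power_law(small_runs, losses)
predicted = predict_loss(a, b, 1.0)  # extrapolate to the full-size run
print(f"fitted a={a:.4f}, b={b:.4f}, predicted full-run loss={predicted:.4f}")
```

Because the synthetic points lie exactly on the assumed curve, the fit recovers it trivially. With real training runs the points would be noisy, and the interesting question is how far the fitted curve can be trusted beyond the largest small run.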

2 comments

Isn't this all based on self-attestation? There is no comprehensive audit of their research data and finances that I am aware of. If I were OpenAI and had blown millions of dollars training models that showed exponentially worse performance for incrementally more resources spent on training, my next step would not be to publish about it.
There was an entire post and discussion on HN a few weeks ago about whether this post from OpenAI was:

"just trying to temper investor enthusiasm"

"trying to downplay AI threats to calm down regulators"

etc.

But it is not some "proof" that LLMs have reached a limit.

It is a self-reported note along the lines of: "nothing to see here, we're at our limit, it's all good, stop probing us".