| HN Mirror


by jiggawatts 68 days ago

> retain that capability forever

Not really. The base training data cutoff will quickly render models useless as they fail to keep up with developments.

Translating some Farsi news articles about the war was hilarious: Gemini Pro got into a panic, and ChatGPT either accused me of spreading fake news or assumed this was some sort of fantasy scenario.

3 comments

jeremyjh 67 days ago

Karpathy and others consider pre-training knowledge as much a liability as an asset. If we could retain just the emergent reasoning and language capability without the hazy recollections, the models would likely be stronger.


m00x 68 days ago

That's GPT-4-era thinking. New models use tools to look up current events or the latest versions, and rely very little on knowledge stored in their weights.


zozbot234 68 days ago

You can pull new information into the context via RAG, but that is expensive and only gives very shallow understanding compared to retraining.


nl 68 days ago

Not really.

For coding I care mostly about reasoning ability, which is uncorrelated with the training cutoff.
