| HN Mirror


	by categoricalrift 101 days ago
	How about the very last "Kept Improvement" in the plot? It's titled "random seed 42 -> 137". I do think this project is quite conceptually interesting, but the model literally choosing a different random seed to achieve lower loss feels pretty far removed from the flowery sci-fi writing at the top of the readme.

3 comments

karpathy 101 days ago

So the interesting part about this one is that when I had the model write up the results for that session:

https://github.com/karpathy/autoresearch/discussions/32

Look at its comment about this "improvement":

""" Surprising non-results:

- Changing random seed from 42→137 improved by 0.0004. Seed 7 was worse. Make of that what you will. """

So the model knows! It knows, after the fact, that this was a weird thing to do. I think it's silly that the model even tried this and ran it, but some part of it also knows that it was wrong. This means that this is fixable via prompt.md.

eternauta3k 101 days ago

It shows that both Karpathy and the LLM have good taste in random seeds: the answer to life, the universe and everything, and ~1/(the fine structure constant)

aix1 101 days ago

The 42 -> 137 also jumped out at me. On the face of it, the associated improvement sure does sound like overfitting to the eval set.
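
One way to check whether a seed change like this is a real improvement or just noise is to compare the delta against the seed-to-seed spread of the same configuration. Below is a minimal sketch of that check, not anything taken from the autoresearch repo: the function names, the threshold `k`, and the five loss values are hypothetical, made up for illustration. Only the 0.0004 delta comes from the write-up quoted above.

```python
import statistics


def seed_noise(losses):
    """Sample standard deviation of final losses across runs that differ only in seed."""
    return statistics.stdev(losses)


def exceeds_seed_noise(improvement, losses, k=2.0):
    """True if `improvement` is larger than k standard deviations of seed-to-seed noise.

    `k` is an arbitrary illustrative threshold, not a rigorous significance test.
    """
    return improvement > k * seed_noise(losses)


# Hypothetical validation losses for one fixed config under five different seeds.
hypothetical_losses = [0.9980, 0.9975, 0.9984, 0.9978, 0.9983]

# The 42 -> 137 delta reported in the quoted write-up.
claimed_improvement = 0.0004

print(exceeds_seed_noise(claimed_improvement, hypothetical_losses))
```

If the spread across seeds is comparable to or larger than the delta, as in these made-up numbers, keeping the new seed amounts to selecting on eval noise rather than finding an improvement.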