by janalsncm
101 days ago
I don’t disagree, but it’s worth looking at the changes the LLM actually applied: https://github.com/karpathy/autoresearch/blob/master/progres...

My opinion is that you’d have to go pretty far along the x-axis before you reach anything other than tinkering with batch size, learning rate, or positional encodings. So many hyperparameter knobs are already exposed that duplicating layers is unlikely to be proposed for a long time.

I also just noticed that the last change it applied was changing the random seed. Lol.