by rsweeney21
871 days ago
It's still strange to me to work in a field of computer science where we say things like "we're not exactly sure how these numbers (hyperparameters) affect the result, so just try a bunch of different values and see which one works best."
Isn't it the same for anything that uses a Monte Carlo simulation to find a value? At times you'll end up at a local maximum (instead of the best/correct answer), but it works.
We can't solve something with a closed-form formula, so we just do a billion (or whatever) random samplings and find what we're after.
I'm not saying it's the same for LLMs, but "trying a bunch of different values and seeing which one works best" is something we do a lot.
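
To make "random samplings" concrete, here is a minimal sketch of random search in Python. The `score` function and the `random_search` helper are made up for illustration: `score` is a toy stand-in for "how well does this setting work", not anything from a real training setup.

```python
import random


def score(x):
    # Hypothetical objective: peaks at x = 0.3. In practice this would be
    # an expensive evaluation with no convenient closed-form solution.
    return -(x - 0.3) ** 2


def random_search(objective, low, high, n_samples, seed=0):
    """Sample candidates uniformly in [low, high] and keep the best one seen."""
    rng = random.Random(seed)
    best_x, best_score = None, float("-inf")
    for _ in range(n_samples):
        x = rng.uniform(low, high)
        s = objective(x)
        if s > best_score:
            best_x, best_score = x, s
    return best_x, best_score


best_x, best_score = random_search(score, 0.0, 1.0, 10_000)
print(best_x, best_score)
```

With enough samples the best candidate lands close to the true optimum, but nothing guarantees it is exactly right, which is the trade-off I mean.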