|
|
|
|
|
by baanist
617 days ago
|
|
I thought the whole point of neural networks was that they were good at searching through these spaces. I'm pretty sure OpenAI is pruning their models behind the scenes to reduce their costs because that's the only way they can keep reducing the cost per token. So their secret sauce at this point is whatever pruning AI they're using to whittle the large computation graphs into more cost efficient consumer products. |
|
If you were to search for billions of parameters by brute force, you literally could not do it in the lifespan of the universe.
A neural network is differentiable, meaning you can take the derivative of it. You train the parameters by taking finding gradient with respect to each parameter, and going in the opposite direction. Hence the name of the popular algorithm, gradient descent.