by uh_uh
618 days ago
> That's the only non-linearity I'm aware of.

"only" is doing a lot of work here, because that one non-linearity is enough to vastly expand the set of functions an NN can approximate. If the NN were linear, you could greatly reduce the compute needed for the whole thing, since a stack of linear layers collapses into a single matrix (as another commenter implied above), but you also wouldn't get a GPT out of it.
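To make that concrete, here is a minimal sketch in plain Python. The tiny weights `W1` and `W2` and the helper names are made up for illustration and are not taken from any real model. Without an activation the two layers multiply out to a single matrix (here the zero matrix), while putting a ReLU between the same two layers gives |x|, which no single linear map can represent.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(v):
    """Elementwise max(0, x): the one non-linearity under discussion."""
    return [max(0.0, x) for x in v]

# Hypothetical toy weights: 1 input -> 2 hidden units -> 1 output.
W1 = [[1.0], [-1.0]]
W2 = [[1.0, 1.0]]

def linear_stack(x):
    """Two layers with no activation in between."""
    return matvec(W2, matvec(W1, x))

def with_relu(x):
    """The same two layers with a ReLU in between."""
    return matvec(W2, relu(matvec(W1, x)))

# The linear stack is equivalent to one precomputed matrix W2 @ W1.
collapsed = matmul(W2, W1)

for value in (-2.0, 0.0, 2.0):
    x = [value]
    print(value, linear_stack(x), matvec(collapsed, x), with_relu(x))
```

A linear map f must satisfy f(x) + f(-x) = f(0) = 0, and the ReLU version breaks that: it returns 2.0 for both 2.0 and -2.0.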