by dartos
900 days ago
Transformers are a kind of neural network architecture. It's mainly fancy math. With tools like PyTorch or TensorFlow, you use Python to describe a graph of computations, which gets executed by (or compiled down into) optimized low-level kernels. There are some examples of people implementing transformers and other NN architectures in about 100 lines of code. I'd google for those to see what these things look like in code. The training loop, data, and resulting weights are where the magic is. The code is disappointingly simple.
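To give a feel for how small the core math is, here's a rough sketch of single-head self-attention (the central operation in a transformer) in plain Python, no framework. The function names, the toy input, and the identity matrices standing in for trained weights are all made up for illustration; a real model would use PyTorch or TensorFlow tensors, many heads and layers, and weights learned in a training loop.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    # x has one row per token; w_q, w_k, w_v are the projection matrices
    # (in a trained model these are the learned weights).
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(k[0]))
    # Score every token against every other token, scaled by sqrt(d_k).
    scores = [
        [sum(qi * ki for qi, ki in zip(q_row, k_row)) / scale for k_row in k]
        for q_row in q
    ]
    # Turn each row of scores into weights that sum to 1.
    weights = [softmax(row) for row in scores]
    # Each output row is a weighted average of the value rows.
    return matmul(weights, v), weights

# Toy input (hypothetical): 3 tokens with 2 features each.
# Identity projections stand in for trained weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(tokens, identity, identity, identity)
```

Here `attn` holds one row of attention weights per token and `out` holds the mixed token representations. Everything interesting in a real model comes from what the weight matrices contain after training, not from this code.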
It feels a little similar to some of the basic reactions that make up DNA, though: you start with simple units that work together to form something much more complex.
(Apologies for the poor metaphors, I'm still trying to grasp some of the concepts involved with this.)