by jongjong 895 days ago
As someone who has written an ANN from scratch and hasn't used TensorFlow, I still find this description confusing.

I asked ChatGPT to explain how to modify a basic ANN to implement self-attention without using the terms "matrix" or "vector", and it gave me a really simple explanation, though I haven't tried to implement it yet.

I prefer to think of everything in terms of nodes, weights and layers. Matrices and vectors just make it harder to relate to what's happening in the ANN.

The way I'm used to writing ANNs, each input node is a scalar, but the feed-forward algorithm looks like vector-matrix multiplication, since you multiply all the input nodes by their weights and then sum them up for each output node... Anyway, I feel like I'm approaching these descriptions with the wrong mindset. Maybe I lack the necessary background.
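
To make the node-versus-matrix point concrete, here is a rough sketch in plain Python of one fully connected layer written both ways. The function names, the `weights[j][i]` layout (weight from input node `i` to output node `j`), and the example numbers are all assumptions made up for illustration, and the activation function is left out to keep it short.

```python
def forward_by_nodes(inputs, weights, biases):
    """One dense layer, written node by node with explicit loops.

    Assumed layout: weights[j][i] is the weight on the connection
    from input node i to output node j.
    """
    outputs = []
    for j in range(len(weights)):           # one pass per output node
        total = biases[j]
        for i in range(len(inputs)):        # visit every incoming connection
            total += inputs[i] * weights[j][i]
        outputs.append(total)
    return outputs


def forward_as_matrix(inputs, weights, biases):
    """The same layer, read as a matrix-vector product.

    Each row of `weights` is dotted with the input list, which is
    exactly the multiply-and-sum done per node above.
    """
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical example: 3 input nodes feeding 2 output nodes.
inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 2.0],   # weights into output node 0
    [1.0, 0.0, -0.5],   # weights into output node 1
]
biases = [0.0, 1.0]

print(forward_by_nodes(inputs, weights, biases))
print(forward_as_matrix(inputs, weights, biases))
```

Both functions do the same arithmetic, so the matrix notation is arguably just a compact way of writing the nested loops over nodes and connections.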