Hacker News new | ask | show | jobs
by pankajdoharey 111 days ago
Looks like a Tiny Analytic transformer, RNN is arguably a better choice if you are gonna handwire an architecture to mechanically do addition. Learning is about discovering the patterns and algorithm from data. Wiring a machine to follow a procedure defeats that purpose.
1 comments

it proves that the algorithm is embeddable in a bigger transformer of ~similar architecture.