| Over the last few months I've experimented with many alternatives to transformers, including one I created a GitHub repo for: https://github.com/bggb7781-collab/lrnnsmdds

Architectures I've experimented with, and my personal notes on their pros and cons:

1. RNNs:
Like Mamba, RWKV and my attempt above: likely the best alternative to transformers, but hard to parallelize, and I've personally encountered weird logical "bugs":
a. Severe bias toward repeated text and toward the end of the text corpus (go figure...).
b. Speed similar to transformers.
Pros:
a. Very limited RAM utilization and very good ability to generalize and learn; perplexity reaches extremely low levels (~1.05, versus 2+ for GPT in comparison).
b. Relatively easy to understand, as it uses backprop, feed-forward layers and matrices, very similar to transformers.

2. HDC:
Hyperdimensional computing: for the time being mostly sci-fi...

3. SNNs:
Spiking neural networks: I ended up with several vibecoded implementations in C#, F# and C. Ultimately, despite novel ideas, not very successful. Maybe they could be successful, but at the moment mostly a failure... still potential.

4. TCN:
Temporal convolutional networks... the best case, it seems. Pros: faster than RNNs/transformers, good generalization, could be the next great reduction of resources in generative AI! Screenshot of my latest TCN attempts: https://postimg.cc/R3r71PL2 - ultimately, there is potential! |
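
For anyone unfamiliar with TCNs, here is a minimal pure-Python sketch of the core building block, a causal dilated 1D convolution, plus a tiny stack of them. This is not the code from my repo or from the screenshot. The function names (`causal_dilated_conv1d`, `tcn_stack`, `receptive_field`), the ReLU between layers and the toy weights are all assumptions made for illustration; a real TCN would also use multiple channels, residual connections and learned weights.

```python
def causal_dilated_conv1d(x, weights, dilation=1):
    """Causal 1D convolution over a list of floats (illustrative helper).

    y[t] = sum_k weights[k] * x[t - k * dilation],
    where positions before t = 0 are treated as zero (left padding).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # never look at future (or pre-sequence) positions
                acc += w * x[idx]
        y.append(acc)
    return y


def tcn_stack(x, layers):
    """Apply a list of (weights, dilation) layers with a ReLU after each.

    Assumption: single channel, no residual connections, to keep it short.
    """
    h = list(x)
    for weights, dilation in layers:
        h = [max(0.0, v) for v in causal_dilated_conv1d(h, weights, dilation)]
    return h


def receptive_field(kernel_size, dilations):
    """Number of past time steps one output can see, for equal kernel sizes."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    # Toy weights and the usual doubling dilations 1, 2, 4 (hypothetical values).
    layers = [([1.0, 1.0], 1), ([1.0, 1.0], 2), ([1.0, 1.0], 4)]
    impulse = [1.0] + [0.0] * 7
    print(tcn_stack(impulse, layers))
    print(receptive_field(2, [1, 2, 4]))
```

Every output position depends only on a fixed window of past inputs, so all time steps can be computed independently during training. That is the usual argument for why a TCN is easier to parallelize than an RNN, while doubling the dilation per layer makes the receptive field grow exponentially with depth.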