Hacker News
by minimaxir 2760 days ago
textgenrnn (https://github.com/minimaxir/textgenrnn) uses a simple Attention Weighted Average layer at the end of the model for text generation. In my testing, that layer lets the model learn much better.
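For anyone unfamiliar with the idea, here is a rough sketch of what an attention weighted average does. It is plain Python rather than the actual Keras layer in the repo, and it assumes each timestep is scored with a dot product against a single learned vector (called `context_vector` here, a name made up for illustration). The function name and the toy numbers are also just for illustration.

```python
import math


def attention_weighted_average(hidden_states, context_vector):
    """Pool a sequence of hidden states into a single vector.

    hidden_states: list of T vectors, each a list of D floats.
    context_vector: list of D floats (hypothetical learned weights).
    Returns (pooled_vector, attention_weights).
    """
    # One scalar score per timestep: dot product of the state with the vector.
    scores = [
        sum(h_i * w_i for h_i, w_i in zip(state, context_vector))
        for state in hidden_states
    ]

    # Softmax over timesteps; subtracting the max keeps exp() numerically stable.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the hidden states, one output value per dimension.
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


# Toy input: three timesteps, two dimensions.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A zero vector gives every timestep the same score, so this is a plain mean.
uniform_pooled, uniform_weights = attention_weighted_average(states, [0.0, 0.0])

# A non-zero vector shifts weight toward the timesteps that score highest.
focused_pooled, focused_weights = attention_weighted_average(states, [2.0, 2.0])
```

With a zero vector the result is an ordinary average over timesteps. Training the vector is what lets the model decide which timesteps matter, instead of relying only on the last hidden state.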