Hacker News
by ninjahatori 815 days ago
On a side note: working over longer contexts also reminds me of MemGPT (https://github.com/cpacker/MemGPT). I think a similar concept could be applied to Mamba architecture models too.