by DreamGen 872 days ago
When talking about memory requirements, one also needs to mention the sequence length. In the case of Mixtral, which supports 32000 tokens, the memory that scales with context length (mainly the attention KV cache) can be a significant chunk of the total memory used.
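
A rough back-of-the-envelope sketch of that point, not a measurement. The defaults below are assumptions on my part, taken from what I believe the published Mixtral 8x7B config to be (32 layers, 8 key/value heads, head dimension 128) with fp16 values; `kv_cache_bytes` is just a hypothetical helper name. It only counts the KV cache, not the weights or activations.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_value=2, batch_size=1):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    Defaults are ASSUMED Mixtral 8x7B values (not taken from the comment):
    32 layers, 8 KV heads (grouped-query attention), head_dim 128, fp16.
    """
    # The factor 2 covers one key tensor plus one value tensor per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * seq_len * batch_size


if __name__ == "__main__":
    gib = 1024 ** 3
    for n_tokens in (4096, 32000):
        size = kv_cache_bytes(n_tokens)
        print(f"{n_tokens} tokens: {size / gib:.2f} GiB of KV cache per sequence")
```

Under those assumptions the cache costs 128 KiB per token, so a single 32000-token sequence needs about 3.9 GiB, and it grows linearly with batch size. Whether that counts as "significant" depends on how the weights are quantized and how many sequences you serve at once.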