|
|
|
|
|
by lukebechtel
331 days ago
|
|
It's meant to replace the BPE tokenizer piece, so it isn't a full Language Model by itself. In fact in Gu's blog post (linked in a post below) it's mentioned that they created a Mamba model that used this in place of the tokenizer. |
|