by deepsquirrelnet
635 days ago
MiniLM is a better embedding model. This model does not perform attention calculations or use a deep learning framework after training, so you won’t get the contextual benefits of transformer models here. It’s not meant to be a state-of-the-art model, though. I’ve put in pretty limiting constraints to keep dependencies, size, and hardware requirements low and speed high. Even for a word embedding model it’s quite lightweight, as those have much larger vocabularies and are typically a few gigabytes.
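
As a rough illustration of what "no attention, no framework" can look like at inference time, here is a toy sketch in plain Python. It is not this model's actual code: the `EMBEDDINGS` table, the whitespace tokenizer, and the mean pooling in `embed` are all assumptions made up for the example. The point is only that each token maps to one fixed vector regardless of its neighbors.

```python
# Hypothetical toy table; a real model would load trained vectors from disk.
EMBEDDINGS = {
    "small": [1.0, 0.0, 0.0],
    "fast": [0.0, 1.0, 0.0],
    "model": [0.0, 0.0, 1.0],
}
DIM = 3


def embed(text):
    """Average the static vectors of known tokens (assumed pooling strategy)."""
    tokens = text.lower().split()  # naive whitespace tokenizer, an assumption
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        # No known tokens: fall back to a zero vector.
        return [0.0] * DIM
    # Column-wise mean; each token contributes the same vector in any context.
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

Because the lookup ignores word order and context, `embed("small model")` and `embed("model small")` give the same vector, which is the trade-off against a transformer like MiniLM.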