by osanseviero
582 days ago
Yes, they are still used:
- Encoder-based models have much faster inference (they are not auto-regressive, so one forward pass is enough) and are smaller. They are great for applications where speed and efficiency are key.
- Most embedding models are BERT-based (see the MTEB leaderboard), so they are widely used for retrieval.
- They are also used to filter data for pre-training decoder models. The Llama 3 authors used a quality classifier (DistilRoBERTa) to generate quality scores for documents. Something similar is done for FineWeb-Edu.
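
To make the data-filtering point concrete, here is a rough sketch of what score-then-threshold filtering could look like. It is not the Llama 3 or FineWeb-Edu pipeline: `toy_score` is a hypothetical stand-in for an encoder classifier, and the 0.5 threshold is an arbitrary value chosen for illustration.

```python
from typing import Callable, Iterable, List


def filter_by_quality(
    docs: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Keep only the documents whose quality score is at least `threshold`."""
    return [doc for doc in docs if score_fn(doc) >= threshold]


def toy_score(doc: str) -> float:
    """Hypothetical stand-in for an encoder-based quality classifier.

    Returns the fraction of whitespace-separated words longer than three
    characters. A real setup would call a fine-tuned classifier here.
    """
    words = doc.split()
    if not words:
        return 0.0
    return sum(len(word) > 3 for word in words) / len(words)


docs = [
    "Photosynthesis converts light energy into chemical energy.",
    "lol ok",
    "",
]

# Assumed threshold, for illustration only.
kept = filter_by_quality(docs, toy_score, threshold=0.5)
print(kept)
```

In a real setup `score_fn` would wrap a small encoder classifier, which is where the speed advantage matters: every document in the corpus has to be scored once.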