| HN Mirror



	by Arcuru 551 days ago
	I'm not sure I am understanding where exactly this slots in, but isn't this an embedding model? Shouldn't they be comparing it to a service like Voyage AI? - https://docs.voyageai.com/docs/embeddings

2 comments

janalsncm 551 days ago

You’re comparing SaaS to open weights. A SaaS will never compete on the flexibility of adding a classification head to BERT (where the gradients flow all the way back), training it, knowledge transferring to a similar domain, distilling it down, pruning layers, fine-tuning some more, etc. which is a common ML workflow.
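To make the "gradients flow all the way back" point concrete, here is a toy sketch in plain Python. It is not BERT: the "encoder" and the "head" are each a single scalar weight, and the names `train` and `update_encoder` are made up for illustration. With `update_encoder=False` the encoder acts like a fixed embedding service and only the head learns; with `update_encoder=True` the loss gradient also reaches the encoder weight.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(data, update_encoder, steps=200, lr=0.1):
    """Train a one-weight 'encoder' plus a one-weight classification head.

    Hypothetical toy model: h = w_enc * x, p = sigmoid(w_head * h),
    trained with binary cross-entropy and plain SGD.
    """
    w_enc, w_head = 0.5, 0.1
    for _ in range(steps):
        for x, y in data:
            h = w_enc * x                # "embedding" produced by the encoder
            p = sigmoid(w_head * h)      # head prediction
            dz = p - y                   # dLoss/dlogit for BCE with sigmoid
            grad_head = dz * h           # gradient for the head weight
            grad_enc = dz * w_head * x   # gradient that flows back into the encoder
            w_head -= lr * grad_head
            if update_encoder:
                w_enc -= lr * grad_enc
    return w_enc, w_head


# Positive inputs are class 1, negative inputs are class 0.
data = [(1.0, 1), (2.0, 1), (-1.0, 0), (-2.0, 0)]

frozen_enc, _ = train(data, update_encoder=False)  # embedding-as-a-service analogue
tuned_enc, _ = train(data, update_encoder=True)    # open-weights fine-tuning analogue
print("encoder weight, frozen:", frozen_enc)
print("encoder weight, fine-tuned:", tuned_enc)
```

In the frozen case the encoder weight stays at its initial value, which is the situation you are in when all you get from a provider is the embedding vector.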


spott 551 days ago

Embedding models are frequently based on BERT-style models, but BERT models can be fine-tuned to do a lot more than just embeddings.

So an embedding-focused fine-tune of ModernBERT should be compared to something like Voyage AI, but not ModernBERT itself.


KTibow 551 days ago

What are the people who keep downloading BERT doing then? Are they the minority who directly use it for embeddings?


janalsncm 551 days ago

They are probably fine-tuning on their own particular downstream tasks, either for embeddings or as a component of a larger model.


spott 551 days ago

I’m honestly not sure why bert-base-uncased is so popular… the model isn’t that useful on its own. From their Hugging Face page:

> You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions of a task that interests you.

> Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. For tasks such as text generation you should look at model like GPT2.


strangecasts 551 days ago

I think this comes down to the Hugging Face libraries defaulting to downloading the model from HF if they cannot locate the weights locally. "Make your own text classifier" tutorial notebooks default to bert-base-uncased as a "standard" pretrained encoder you can put a classification head on top of and fine-tune. In turn, people run them in Google Colab and just download another copy of the weights on startup, which counts towards the total.
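A toy model of that download-counting effect, in plain Python. Nothing here is the real Hugging Face API: `ToyHub` and `load_weights` are hypothetical stand-ins that only mimic "use the local cache if the weights are there, otherwise download and count it".

```python
import tempfile
from pathlib import Path


class ToyHub:
    """Hypothetical stand-in for a model hub that counts downloads."""

    def __init__(self):
        self.download_count = 0

    def download(self, model_id):
        self.download_count += 1
        return b"fake-weights-for-" + model_id.encode()


def load_weights(hub, model_id, cache_dir):
    """Return cached weights if present, otherwise download and cache them."""
    cached = Path(cache_dir) / model_id
    if cached.exists():
        return cached.read_bytes()
    weights = hub.download(model_id)
    cached.write_bytes(weights)
    return weights


hub = ToyHub()

# A machine with a persistent cache: three runs, one download.
with tempfile.TemporaryDirectory() as cache:
    for _ in range(3):
        load_weights(hub, "bert-base-uncased", cache)
persistent_downloads = hub.download_count

# Fresh notebook sessions: each run starts with an empty cache.
for _ in range(3):
    with tempfile.TemporaryDirectory() as cache:
        load_weights(hub, "bert-base-uncased", cache)
ephemeral_downloads = hub.download_count - persistent_downloads

print(persistent_downloads, ephemeral_downloads)
```

The same three runs cost one download on a machine that keeps its cache and three downloads when every session starts from scratch, which is the Colab situation described above.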


metanonsense 551 days ago

I have been out of the game for a year or so (and was never completely in the game), but back then BERT was the basis for lots of interesting applications. The original Vision Transformer (ViT) was based on (or at least inspired by) BERT, and BERT was also used for graph transformers, visual language understanding, etc.
