by PaulHoule 703 days ago
I am a huge fan of this kind of model

https://sbert.net/

for classification, clustering, and both text and image retrieval. It is often a drop-in replacement for other ways of doing things and most of their models are not crazy large so you can run them on an ordinary computer.
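To make the retrieval case concrete, here is a minimal sketch of embedding-based search. It is not the sbert API: the `embed` function is a hypothetical toy stand-in (word counts) so the example runs with only the standard library, and `cosine` and `retrieve` are illustrative names. With a real sentence embedding model you would presumably swap `embed` for the model's encode step and keep the ranking logic roughly the same.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence embedding (assumption): a sparse word-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, documents, top_k=1):
    """Rank documents by similarity to the query and return the top_k (score, doc) pairs."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

docs = [
    "how to train a text classifier",
    "clustering news articles by topic",
    "searching images with a text query",
]
best = retrieve("text classifier training", docs)
print(best[0][1])
```

The same nearest-neighbour idea covers the other uses mentioned: classification can assign the label of the closest labelled example, and clustering groups vectors that sit near each other. The toy word-count vectors only match exact words, which is the limitation learned embeddings are meant to remove.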

As for chatbots, you should note they have superhuman recall in some sense but a limited ability to generalize or "reason". I have been asking Microsoft's Copilot for help with a maintenance programming project and I am amazed at its ability to explain unusual but highly repetitive code fragments like the ones generated by the Babel compiler. Explaining what a program does by looking at the code is a difficult problem, and LLMs cannot do it reliably if they haven't seen very similar code before. But there are many idioms used in application code that they have seen before, and for those they are helpful.

1 comment

Yes, I have often thought that if it wasn't for GPT-3, this work would have been more recognised and more powerful. However, generative embeddings are still more useful for more abstract cases, and they embed into a different space from these similarity/contrastive kinds of embeddings.