Hacker News
by mlucy 2721 days ago
It's really difficult to overstate how important embeddings are going to be for ML.

Word embeddings have already transformed NLP. When most people I know sit down to work on an NLP task, the first thing they do is use an off-the-shelf library to turn the text into a sequence of embedded tokens. They don't even think about it; it's just the natural first step, because it makes everything so much easier.

In the last couple years, embeddings for other data types (images, whole sentences, audio, etc.) have started to enter mainstream practice too. You can get near-state-of-the-art image classification with a pretrained image embedding, a few thousand examples, and a logistic regression trained on your laptop CPU. It's astonishing.
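
To make the "embedding plus logistic regression" recipe concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `fake_embeddings` is a hypothetical stand-in that draws two Gaussian clusters instead of calling a real pretrained encoder, and the dimension, learning rate, and epoch count are arbitrary.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fake_embeddings(n_per_class, dim, rng):
    """Hypothetical stand-in for a pretrained encoder: two Gaussian clusters."""
    rows = []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            vector = [rng.gauss(center, 1.0) for _ in range(dim)]
            rows.append((vector, label))
    rng.shuffle(rows)
    return rows

def train_logreg(rows, dim, lr=0.1, epochs=50):
    """Logistic regression fitted with plain stochastic gradient descent."""
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in rows:
            # Gradient of the log loss w.r.t. the logit is (prediction - label).
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0

rng = random.Random(0)
rows = fake_embeddings(n_per_class=200, dim=16, rng=rng)
train, test = rows[:300], rows[300:]  # hold out 100 examples
w, b = train_logreg(train, dim=16)
accuracy = sum(predict(w, b, x) == y for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice you would probably feed real embeddings into something like scikit-learn's `LogisticRegression` rather than a hand-rolled trainer. The point is only that the model on top of the embedding is tiny and trains in seconds on a CPU.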

(Note: I work on https://www.basilica.ai , an embeddings-as-a-service company, so I'm definitely a little bit biased.)

2 comments

What I find particularly neat are the non-Euclidean embeddings, such as hyperbolic spaces to generate hierarchies.
Do you know of any good resources for learning about such things for someone with a cursory understanding of Word2Vec?
"Implementing Poincaré Embeddings" (hyperbolic embeddings implemented in Gensim):

https://rare-technologies.com/implementing-poincare-embeddin...
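
For a rough feel of why hyperbolic space suits hierarchies, here is a small sketch of the Poincaré ball distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It is not taken from the linked post, and the point names and coordinates (`root`, `leaf_a`, `leaf_b`) are made up for illustration.

```python
import math

def poincare_distance(u, v):
    """Distance between two points strictly inside the unit ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_dist / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))

# Hypothetical layout: a root at the origin and two leaves near the boundary.
root = [0.0, 0.0]
leaf_a = [0.9, 0.0]
leaf_b = [0.0, 0.9]

print("root -> leaf_a:", poincare_distance(root, leaf_a))
print("leaf_a -> leaf_b:", poincare_distance(leaf_a, leaf_b))
```

Distances blow up near the boundary, so the path between two leaves costs almost as much as going through the root, which is roughly how distances behave in a tree.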

Just like minkzilla, I would love to read more about this.
It's an exciting time for sure. To the layman this feels like the first real progress we've had towards AI since the 70s. It seems like the field kind of wandered off into the realms of pure mathematics for a few decades with little tangible progress, but now we're getting stories every few weeks about how computers can recognize objects in pictures or compose new music or whatever.