by stormfather 1189 days ago
Imagine you pick 2 numbers to describe a basketball: one for weight (1.0) and one for redness (0.7). Now a basketball can be described by those 2 numbers, (1.0, 0.7). That is an embedding of a basketball in 2d space. In that coordinate system a baseball would be less heavy and less red, so maybe you would embed it as (0.2, 0.2).

basketball ==> (1.0, 0.7) # heavier, redder
baseball ==> (0.2, 0.2) # less heavy, less red
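
Once objects are points, you can measure how far apart they are. Here is a rough Python sketch of that idea using the (weight, redness) numbers above. The "softball" entry and its values are made up purely for illustration, and Euclidean distance is just one reasonable choice of measure.

```python
import math

# Toy 2d embeddings as (weight, redness).
# basketball and baseball use the numbers from the comment above;
# "softball" is a hypothetical extra entry with invented values.
embeddings = {
    "basketball": (1.0, 0.7),
    "baseball": (0.2, 0.2),
    "softball": (0.3, 0.2),
}

def distance(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.dist(a, b)

def nearest(name):
    """Return the other object whose embedding is closest to `name`."""
    others = (other for other in embeddings if other != name)
    return min(others, key=lambda other: distance(embeddings[name], embeddings[other]))

d = distance(embeddings["basketball"], embeddings["baseball"])
print(f"basketball <-> baseball distance: {d:.3f}")
print("closest to baseball:", nearest("baseball"))
```

Things that are similar along the chosen properties end up close together, which is the whole point of describing them with numbers.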

When an LLM (large language model) is fed a word, it transforms that word into a vector in n-dimensional space. For example:

basketball -> [0.5, 0.3, 0.6, ... , 0.9] # Here the embedding is many, many numbers

It does this because computers process numbers, not words. Each of these numbers represents some property of the word/concept basketball in a way that makes sense to the model. The model learns this mapping during its training, and the humans who train these models can only guess at what the learned dimensions actually represent. This is the first step of what an LLM does when it processes text.
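
Mechanically, that first step can be pictured as a table lookup: each word has an id, and the id selects a row of numbers. The sketch below is only an illustration under assumptions. The tiny vocabulary, the `EMBEDDING_DIM` of 8, and the random values are all hypothetical stand-ins; in a real model the rows are learned during training and there are far more of them.

```python
import random

random.seed(0)  # fixed seed so the placeholder values are repeatable

EMBEDDING_DIM = 8  # hypothetical size; real models use many more dimensions

# Hypothetical tiny vocabulary mapping each word to a row index.
vocab = ["basketball", "baseball", "the"]
word_to_id = {word: i for i, word in enumerate(vocab)}

# One row of numbers per word. Random placeholders here;
# a trained model would have learned these values.
embedding_table = [
    [random.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in vocab
]

def embed(word):
    """Look up the vector stored for `word`."""
    return embedding_table[word_to_id[word]]

vector = embed("basketball")
print(len(vector), "numbers:", [round(x, 2) for x in vector])
```

Nothing in the table says which column means "weight" or "redness", which is why people can only guess at what each learned dimension stands for.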