| HN Mirror

Hacker News


	by danielmarkbruce 460 days ago
	The LLM embedding for a token covers much more than semantics. There is a reason a single token's embedding has so many dimensions. You are conflating the embedding layer in an LLM with an embedding model for semantic search.

heyjamesknight 458 days ago

I don't think we're using the term semantic in the same way. I mean "relating to meaning in language."

danielmarkbruce 458 days ago

The embedding layer in an LLM deals with much more than meaning. It has to capture syntax, grammar, morphology, style and sentiment cues, phonetic and orthographic relationships, and 500 other things that humans can't even reason about but that exist in word combinations.

heyjamesknight 454 days ago

I'll give you that. I was including those in "semantic space," but the distinction is fair.

My original point still stands: the space you've described cannot capture a full image of human cognition.
