| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by sdenton4 281 days ago
	The spare autoencoder work is /exactly/ premised on the kind of near-orthogonality that this article talks about. It's called the 'superposition hypothesis' originally: https://transformer-circuits.pub/2022/toy_model/index.html The SAE's job is to try to pull apart the sparse nearly-orthogonal 'concepts' from a given embedding vector, by decomposing the dense vector into a sparsely activation over-complete basis. They tend to find that this works well, and even allows matching embedding spaces between different LLMs efficiently.

1 comments

gpjanik 280 days ago

Agreed, but that's not in the C dimension of a first-layer embedding of a single token though, it's across the whole model and that's what I said in the comment above.