Hacker News new | ask | show | jobs
by rollinDyno 798 days ago
That weighted context is the 12228 dimensional vector, no?

I suppose that when you each element in the vector weighs 16 bits then the space is immense and capable to have a novel in a point.

2 comments

GPT-4 is configurable up to 96 layers, each running their own embeddings. I think it was a business choice to afford the compute while they scale.
But if I understand correctly, GPT-4 reduces that to a 1536-dimensional vector. Roughly 1/8th. It's counterintuitive to me.