Hacker News new | ask | show | jobs
by faramarz 787 days ago
GPT-4 is configurable up to 96 layers, each running their own embeddings. I think it was a business choice to afford the compute while they scale.