by smaddox 726 days ago
Because existing LLMs store no more than 2 bits of knowledge per parameter, even though each parameter is stored with many more bits of precision than that: https://arxiv.org/abs/2404.05405
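To put rough numbers on that gap, here's a back-of-the-envelope sketch. The 7B parameter count and 16-bit weights are my own illustrative assumptions, not figures from the paper; only the 2 bits/parameter bound comes from the claim above.

```python
# Back-of-the-envelope: knowledge capacity vs. raw storage of the weights.
# ASSUMPTIONS (illustrative only): a 7B-parameter model stored as 16-bit floats.
# The 2 bits/parameter figure is the upper bound quoted in the comment.

BITS_OF_KNOWLEDGE_PER_PARAM = 2


def knowledge_capacity_bits(num_params, bits_per_param=BITS_OF_KNOWLEDGE_PER_PARAM):
    """Upper bound on stored knowledge, in bits."""
    return num_params * bits_per_param


def storage_bits(num_params, precision_bits):
    """Raw size of the weights, in bits."""
    return num_params * precision_bits


def utilization(precision_bits, bits_per_param=BITS_OF_KNOWLEDGE_PER_PARAM):
    """Fraction of the stored bits that carry knowledge under the bound."""
    return bits_per_param / precision_bits


num_params = 7_000_000_000  # hypothetical model size
precision_bits = 16         # hypothetical weight precision

capacity = knowledge_capacity_bits(num_params)
raw = storage_bits(num_params, precision_bits)

print(f"knowledge capacity: {capacity / 8 / 1e9:.2f} GB")  # 1.75 GB
print(f"raw weight storage: {raw / 8 / 1e9:.2f} GB")       # 14.00 GB
print(f"utilization: {utilization(precision_bits):.1%}")   # 12.5%
```

So under those assumptions, at most about one eighth of the bits in the weights would be carrying knowledge.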