by nikki93 287 days ago
A relevant paper: https://arxiv.org/abs/2306.11644 -- the Phi models (and many others too) are based on this idea.