Hacker News
by nwoli 1180 days ago
What we need is a RETRO-style model where, after the input, you go through a small net that just fetches a desired set of weights from a server (serving data without compute is dirt cheap), and those weights are then executed locally. We’ll get there eventually.
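A rough sketch of the idea in this comment, for anyone who wants something concrete: a small local router picks a key, the "server" only looks up and returns weights, and the forward pass runs on the client. Everything here is hypothetical (the `WEIGHT_SERVER` dict stands in for a network fetch, the expert names and numbers are made up). Note this is the fetch-weights variant described above, not RETRO as published, which retrieves text chunks rather than weights.

```python
import math

# Hypothetical stand-in for a remote weight store. It only serves data,
# it does no compute. A real system would do a network fetch here.
WEIGHT_SERVER = {
    "expert_a": {"w": [[1.0, 0.0], [0.0, 1.0]], "b": [0.0, 0.0]},
    "expert_b": {"w": [[0.0, 1.0], [1.0, 0.0]], "b": [1.0, 1.0]},
}

# Small local router: one score vector per expert key (made-up values).
ROUTER = {"expert_a": [1.0, -1.0], "expert_b": [-1.0, 1.0]}


def route(x):
    """Pick the key whose router vector has the largest dot product with x."""
    return max(ROUTER, key=lambda k: sum(r * xi for r, xi in zip(ROUTER[k], x)))


def fetch_weights(key):
    """Simulated download: the server does a lookup, not a forward pass."""
    return WEIGHT_SERVER[key]


def run_locally(x, params):
    """Apply the fetched linear layer followed by tanh on the local machine."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["w"], params["b"])
    ]


def forward(x):
    """Route locally, fetch the chosen weights, then compute locally."""
    key = route(x)
    return key, run_locally(x, fetch_weights(key))
```

Calling `forward([2.0, 0.5])` routes to `expert_a` and `forward([0.0, 3.0])` routes to `expert_b`; the only thing crossing the "network" is the key going out and the weights coming back.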
1 comment

Can anyone explain, or link a resource on, why none of these big GPT models incorporate anything RETRO-style? I'm only following ML developments very superficially, and I was so hyped by RETRO, yet none of the modern world-changing models apply it.
OpenAI might very well be using it internally; who knows how they implement things. Also, Emad retweeted a RETRO-related thing a while back, so they might very well be using it for their awaited LM. Here’s hoping.