Hacker News
by anon373839 479 days ago
It really is an astonishing technological feat! Also note that the largest model they trained is only 8.3B parameters (8B backbone + 0.3B decoder). It's exciting to think that they're going to release this model under an Apache 2.0 license.