by free_rms 2009 days ago

There's also

* this generation of language models leaning into transfer learning, which reduces the total number of training runs needed for different applications

* TPUs being more power-efficient than GPUs (the numbers they used in the paper were based on GPUs)

* other energy-centric stuff that's not just offsets: efficiency, like you mention, in addition to sourcing from renewables