by tylerekahn 1226 days ago

The trainable parameter count is actually as low as 0.01% of the original weights.

From the LoRA paper:

> When the pre-trained model is GPT-3 175B, the number of trainable parameters |Θ| can be as small as 0.01% of |Φ₀|.
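
For a rough sanity check of that figure, here is a back-of-the-envelope sketch in Python. It uses the published GPT-3 175B shape (96 layers, hidden size 12288). The LoRA configuration is an assumption for illustration: rank 4 applied to two square attention projections per layer (for example query and value). The helper name `lora_trainable_params` is made up for this example and is not from any library.

```python
def lora_trainable_params(d_model: int, n_layers: int, rank: int, adapted_per_layer: int) -> int:
    """Count LoRA parameters when each adapted matrix is d_model x d_model.

    Each adapted matrix gets two low-rank factors:
    B (d_model x rank) and A (rank x d_model), so 2 * d_model * rank values.
    """
    per_matrix = 2 * d_model * rank
    return per_matrix * adapted_per_layer * n_layers


# GPT-3 175B shape
D_MODEL = 12288
N_LAYERS = 96
TOTAL_PARAMS = 175_000_000_000

# Assumed configuration: rank 4 on two projections per layer
trainable = lora_trainable_params(D_MODEL, N_LAYERS, rank=4, adapted_per_layer=2)
fraction_percent = 100 * trainable / TOTAL_PARAMS

print(f"trainable parameters: {trainable:,}")
print(f"share of original weights: {fraction_percent:.4f}%")
```

Under these assumptions the count comes to roughly 19 million trainable parameters, which is about 0.01% of 175 billion. Lower ranks or fewer adapted matrices would push the share lower still.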