by cypress66 1010 days ago
Llama 7B is quite dumb. With the 13B you'd get significantly better results, and you can train a QLoRA on a single 3090 (I think even less is possible, but I'm not sure).
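A rough back-of-the-envelope check on the single-3090 claim, as a sketch only. It assumes nominal parameter counts (7e9 and 13e9), 4-bit quantized base weights as in QLoRA, and the 24 GB of VRAM on an RTX 3090. It counts weight memory only and ignores activations, adapter parameters, optimizer state, and quantization overhead, so real usage will be higher. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


RTX_3090_VRAM_GB = 24  # VRAM on a single RTX 3090

# Nominal parameter counts; the real models differ slightly (assumption).
for name, n_params in [("7B", 7e9), ("13B", 13e9)]:
    fp16 = weight_memory_gb(n_params, 16)  # unquantized half precision
    q4 = weight_memory_gb(n_params, 4)     # 4-bit quantized base weights
    fits = "fits" if q4 < RTX_3090_VRAM_GB else "does not fit"
    print(f"{name}: fp16 ~{fp16:.1f} GB, 4-bit ~{q4:.1f} GB ({fits} in 24 GB, weights only)")
```

Under these assumptions the 13B weights alone exceed 24 GB in half precision but take about 6.5 GB in 4-bit, which leaves headroom for the rest of the training state.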

Oh yeah, definitely. Do you know how I can get access to one cheaply, though? I burnt through $150 just on this exercise with a P100 on GCP.
I'd love to see an update for 13B, and I can vouch that vast.ai prices are very good.
Ooof. I'd expect this to cost like 5 bucks on runpod using a single 3090.

I use axolotl for training. I didn't check your notebook, but axolotl likely comes with defaults that are better optimized for speed and VRAM than what you're doing.

vast.ai

Yeah, GCP GPU prices are terrible. $150 for a short time on a P100 is highway robbery.

TPUs are better, but still kinda pricey.