Hacker News
by dhouston 812 days ago
QLoRA + Axolotl + a good foundation model (Llama/Mistral/etc., usually instruction fine-tuned) + RunPod works great.

A single A100 or H100 with 80GB VRAM can fine-tune 70B open models (obviously, scaling out to many nodes/GPUs is faster, and you can use much cheaper GPUs to fine-tune smaller models).
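
For a rough sense of why 70B fits on one 80GB card, here is a back-of-envelope sketch. It assumes 4-bit quantized base weights (about 0.5 bytes per parameter, as in QLoRA) and counts only those frozen weights; adapters, optimizer state, activations, and quantization constants add more on top and depend heavily on sequence length and batch size. The function name and the model sizes listed are just illustrative.

```python
def quantized_weight_gib(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the frozen base weights in GiB.

    Assumption: every parameter is stored at `bits_per_weight` bits.
    This ignores LoRA adapters, optimizer state, activations, and
    quantization metadata, so treat the result as a lower bound.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Illustrative model sizes, in billions of parameters.
    for size in (7, 13, 70):
        gib = quantized_weight_gib(size)
        print(f"{size}B at 4-bit: ~{gib:.1f} GiB of weights")
```

By this estimate the 4-bit weights of a 70B model take a bit under half of an 80GB card, leaving the rest for everything the estimate ignores. The same weights are larger than the 24GB on a single consumer card, which is why smaller models are the usual target there.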

The LocalLLaMA subreddit at https://www.reddit.com/r/LocalLLaMA/ is also an awesome community for the GPU poor :)

2 comments

Can consumer setups like a single RTX 3090 or 4x RTX 3090 get anywhere with this?

Have you seen any benchmarks?

Thank you! And yes, huge fan of r/LocalLLaMA :)