Training what on what? ResNet-50 on ImageNet? Yeah, sure, a single 4090 is fine. It will take a bit.

A 1.5B-parameter LLM? That’s a few weeks with 64 V100s, and that's on a small dataset.

Training something Llama 7B class (not using LoRA)? Weeks with the same number of A100s.
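
If you want to sanity-check numbers like these yourself, here is a rough back-of-envelope sketch. It assumes the common rule of thumb that training a dense transformer costs about 6 FLOPs per parameter per token. The token count, the utilization figure (`mfu`), and the function name are my own placeholders rather than measurements, and the peak throughput values are approximate vendor peak numbers.

```python
def training_days(params, tokens, num_gpus, peak_flops, mfu=0.35):
    """Rough wall-clock estimate in days.

    Uses the ~6 * params * tokens rule of thumb for total training FLOPs.
    mfu is the assumed fraction of peak throughput actually sustained;
    0.35 is a guess, real runs vary a lot.
    """
    total_flops = 6 * params * tokens
    sustained_flops_per_sec = num_gpus * peak_flops * mfu
    return total_flops / sustained_flops_per_sec / 86_400  # seconds per day


# Approximate vendor peak tensor throughput, in FLOP/s (assumed values).
V100_PEAK = 125e12
A100_PEAK = 312e12

# Hypothetical token budgets, only to show how the estimate scales.
print(training_days(params=1.5e9, tokens=100e9, num_gpus=64, peak_flops=V100_PEAK))
print(training_days(params=7e9, tokens=1e12, num_gpus=64, peak_flops=A100_PEAK))
```

The estimate scales linearly with both model size and token count, so the answer depends at least as much on how many tokens you train on as on the parameter count.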

With LoRA? Back to a single 4090, depending on your dataset. It still might take weeks to go through 2,000 fine-tuning examples with a large context size.
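
To see why LoRA fits on one card, here is a small sketch that counts the trainable parameters. LoRA adds two low-rank matrices per adapted weight, so a `d_out x d_in` matrix gains `rank * (d_in + d_out)` trainable parameters. The hidden size (4096) and layer count (32) are the Llama 7B values; the rank of 8 and the choice to adapt only two projections per layer are assumptions for illustration, not a recommendation.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix:
    an A matrix (rank x d_in) plus a B matrix (d_out x rank)."""
    return rank * (d_in + d_out)


HIDDEN = 4096          # Llama 7B hidden size
LAYERS = 32            # Llama 7B transformer layers
RANK = 8               # assumed LoRA rank
ADAPTED_PER_LAYER = 2  # assumption: adapt only the query and value projections

trainable = LAYERS * ADAPTED_PER_LAYER * lora_params(HIDDEN, HIDDEN, RANK)
print(f"{trainable:,} trainable parameters")
print(f"{trainable / 7e9:.4%} of a 7B model")
```

Only those adapter weights need gradients and optimizer state, which is what cuts the memory footprint. The forward and backward passes still run through the full frozen model, so long contexts stay slow.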