by nico
1188 days ago
> The code runs on a 8xA100 80GB, but can also run on 8xA100 40GB or 4xA100 with lower batch size and gradient accumulation steps. To get the GPUs, I suggest using Lambda Labs, best pricing for the best hardware.

I wonder how much the fine-tuning cost in total, in $. Also, does anyone have some sort of table or formula that relates the size of the training data (MB/GB) to the $ cost of fine-tuning?