Hacker News
by mirekrusin 1126 days ago
People should be training model sizes that fit and fill consumer GPUs, e.g.:

2x 24 GB (dual GPU) ~ 28B model

1x 24 GB ~ 14B model

etc.
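
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. The comment doesn't say which precision or how much headroom it has in mind, so the defaults below (8-bit weights at 1 byte per parameter, about 40% of VRAM kept free for KV cache, activations and runtime overhead) are my assumptions, picked only because they land near the 14B / 28B figures. The function name `max_params_billions` is made up for illustration.

```python
def max_params_billions(
    vram_gb: float,
    num_gpus: int = 1,
    bytes_per_param: float = 1.0,    # assumption: 8-bit weights
    headroom_fraction: float = 0.4,  # assumption: VRAM kept free for KV cache etc.
) -> float:
    """Rough upper bound on parameter count (in billions) whose weights fit.

    Treats 1 GB as 1e9 bytes and ignores multi-GPU sharding overhead.
    """
    if not 0.0 <= headroom_fraction < 1.0:
        raise ValueError("headroom_fraction must be in [0, 1)")
    if bytes_per_param <= 0:
        raise ValueError("bytes_per_param must be positive")
    usable_bytes = vram_gb * 1e9 * num_gpus * (1.0 - headroom_fraction)
    return usable_bytes / bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"1x 24 GB, 8-bit: ~{max_params_billions(24):.1f}B")
    print(f"2x 24 GB, 8-bit: ~{max_params_billions(24, num_gpus=2):.1f}B")
    print(f"1x 24 GB, fp16:  ~{max_params_billions(24, bytes_per_param=2.0):.1f}B")
```

Under those assumptions a single 24 GB card comes out around 14B and a pair around 29B; at fp16 the same card drops to roughly 7B, so the target size depends heavily on the precision you plan to run at.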