Hacker News
by
ls612
334 days ago
This is a dumb question, I know, but how expensive is model distillation? How much training hardware do you need to take something like this and create 7B and 12B versions for consumer hardware?
johnb231
334 days ago
The process involves running the original model. You can rent these big GPUs for ~$10 per hour each, so that is ~$160 per hour (which implies about 16 GPUs) for as long as it takes.
qeternity
334 days ago
You can rent H100s for $1.50/gpu/hr these days.
link
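For a rough sense of how the two quoted rates compare, here is a minimal sketch of the arithmetic in Python. The GPU count of 16 is an assumption inferred from ~$160/hr at ~$10/GPU/hr; nobody in the thread states it, and whether the same count applies to H100s at $1.50/GPU/hr is also an assumption. The function names and the 10-hour run length are hypothetical, since the thread gives no estimate of how long distillation takes.

```python
def rental_cost_per_hour(num_gpus: int, rate_per_gpu_hour: float) -> float:
    """Hourly cost of renting num_gpus GPUs at a flat per-GPU hourly rate."""
    return num_gpus * rate_per_gpu_hour


def run_cost(num_gpus: int, rate_per_gpu_hour: float, hours: float) -> float:
    """Total rental cost for a run lasting the given number of hours."""
    return rental_cost_per_hour(num_gpus, rate_per_gpu_hour) * hours


# Assumption: 16 GPUs, inferred from ~$160/hr divided by ~$10/GPU/hr.
NUM_GPUS = 16

cost_at_10 = rental_cost_per_hour(NUM_GPUS, 10.00)
cost_at_1_50 = rental_cost_per_hour(NUM_GPUS, 1.50)

print(f"At $10.00/GPU/hr: ${cost_at_10:.2f}/hr")
print(f"At $1.50/GPU/hr:  ${cost_at_1_50:.2f}/hr")

# Hypothetical run length, for illustration only.
EXAMPLE_HOURS = 10
print(f"{EXAMPLE_HOURS} h at $1.50/GPU/hr: "
      f"${run_cost(NUM_GPUS, 1.50, EXAMPLE_HOURS):.2f}")
```

Under these assumptions the hourly bill drops from about $160 to about $24. The total still scales linearly with however long the original model has to run.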