by ma2rten 2017 days ago
The pretraining is shared between models; inference is generally much more expensive.