Don't they use different hardware for inference and training? AIUI, the former is usually done on cheaper GDDR cards and the latter on expensive HBM cards.