Hacker News
by 995533 2726 days ago
I would not retrain such a model on all the data; I would just do online updates. I also still think that for this use case training time and latency are negligible (nobody cares about, or even notices, the difference between training a BoW model and a bi-LSTM).
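
To make "online updates" concrete, here is a rough sketch, assuming a binary text classifier with bag-of-words features and plain logistic regression. The class name `OnlineBowClassifier`, the learning rate, and the toy examples are all hypothetical and not taken from the model under discussion. Each new labeled example triggers a single gradient step instead of a full retrain.

```python
import math


class OnlineBowClassifier:
    """Hypothetical bag-of-words logistic regression updated one example at a time."""

    def __init__(self, learning_rate=0.1):
        self.weights = {}  # token -> weight, grows as new tokens are seen
        self.bias = 0.0
        self.learning_rate = learning_rate

    def _tokens(self, text):
        # Naive whitespace tokenizer; a real system would do more here.
        return text.lower().split()

    def predict_proba(self, text):
        # Repeated tokens are summed once per occurrence, i.e. count features.
        score = self.bias + sum(self.weights.get(t, 0.0) for t in self._tokens(text))
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, text, label):
        """One SGD step on the log loss for a single (text, label) pair."""
        error = label - self.predict_proba(text)
        for token in self._tokens(text):
            self.weights[token] = self.weights.get(token, 0.0) + self.learning_rate * error
        self.bias += self.learning_rate * error


model = OnlineBowClassifier()
for _ in range(20):
    model.update("great fun", 1)
    model.update("terrible boring", 0)
```

Each `update` call costs time proportional to the number of tokens in one example, which is why the training cost of this kind of model is hard to notice in practice.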

If you are deploying on resource-constrained devices (e.g., low-end PCs without a GPU), it is not unusual to spend a lot of time training a model on a very powerful computer (a cost nobody cares about), then distill or transfer the result for use at test time.
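
For the distillation step, a minimal sketch of one common formulation: the small model is trained to match the large model's temperature-softened output distribution. The function names and the temperature value below are illustrative assumptions, not a specific library's API.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    Minimizing this with respect to the student pulls its softened
    predictions toward the teacher's.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))


teacher = [4.0, 1.0, 0.5]        # logits from the large model (made-up values)
close_student = [3.5, 1.2, 0.4]  # student that roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # student that disagrees with the teacher

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
```

The expensive part (producing the teacher logits) happens once on the powerful machine; only the small student has to run on the low-end device.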