Hacker News
by peadarohaodha 1962 days ago
We've found you can get an order of magnitude reduction in the amount of labelled data you need, though there is some variance depending on the difficulty of the problem. Because you are retraining the model in tandem with the data labelling process, an active learning powered labelling process needs additional compute compared with just selecting the next data to label at random. But this additional compute is almost always outweighed by the saving in human time spent labelling.
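
To make the contrast with random selection concrete, here is a minimal sketch of one common active learning strategy, uncertainty sampling by predictive entropy. This is an illustration only, not necessarily the method used in the product discussed here; the function names and the example probabilities are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labelling(pool_probs, batch_size):
    """Return indices of the `batch_size` most uncertain unlabelled items."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]

# Hypothetical model outputs for four unlabelled items (binary task).
pool_probs = [
    [0.99, 0.01],  # model is confident
    [0.55, 0.45],  # model is unsure
    [0.80, 0.20],
    [0.50, 0.50],  # model is maximally unsure
]

picked = select_for_labelling(pool_probs, batch_size=2)
print(picked)  # [3, 1]
```

In a full loop, the picked items go to a human annotator, the model is retrained or updated on the new labels, and the pool is re-scored. That re-scoring and retraining is the extra compute the comment refers to.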

I have a question on the compute aspect regarding your business model, hope I'm not being too nosy.

I tried HL, the experience was stellar (well done!) and it made me think...

To get AL working with a great user experience you need quite a bit of compute. How are you thinking about your margins, e.g. the cost to produce what you're offering versus what customers will pay for it?

Thanks for the feedback! It's a good question re compute. There are some fun engineering and ML research challenges related to this that we are constantly iterating on. A few examples:
- how to share compute resources (e.g. GPU memory) most efficiently in a JIT manner during model serving, for both training and inference (where the use case and privacy requirements permit)
- how to construct model training algorithms that operate effectively in a more online manner (so you don't have to retrain on the whole dataset when you see new examples)
- how to significantly reduce the footprint (in terms of memory and FLOPs) of modern deep transformer models, given they are highly over-parameterised and can contain a lot of redundancy

This stuff helps us a lot on the margins point!
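
As a rough illustration of the "more online" training idea mentioned in the reply, the sketch below updates a tiny logistic regression model one newly labelled example at a time, instead of refitting on the whole dataset. It is an assumption-laden toy (the class name, learning rate, and data are all hypothetical) and says nothing about how the transformer models in question are actually trained.

```python
import math

class OnlineLogisticRegression:
    """Toy binary classifier updated one example at a time via SGD."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        """Probability that x belongs to class 1."""
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        """One gradient step on a single newly labelled example (y in {0, 1})."""
        err = self.predict_proba(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Hypothetical stream of labels arriving from an annotator.
model = OnlineLogisticRegression(n_features=2)
stream = [([1.0, 0.0], 1), ([0.0, 1.0], 0)] * 50
for x, y in stream:
    model.update(x, y)  # cost per label is O(n_features), not O(dataset size)

print(model.predict_proba([1.0, 0.0]) > 0.5)  # True
print(model.predict_proba([0.0, 1.0]) < 0.5)  # True
```

The trade-off is that purely incremental updates can drift or forget earlier examples, which is presumably part of why the reply calls doing this "effectively" a research challenge.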