| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by lemursage 740 days ago

In larger companies, and, specifically, bigger projects, systems tend to have multiple ML components, and those are usually a mix of large NN models and more classical (ML) algorithms, so you end up tweaking multiple parts at once. In my case optimising for such systems is ~90% of the work. For instance, can I make the model lighter or go faster and keep the performance? Or, can I make it go faster? Loss change, pruning, quantisation, dataset optimisation etc. -- most of the time I'm testing out those options & tweaking parameters. There is of course the deployment part, but this one is usually a quickie if your team has specific processes/pipelines for this. There's a checklist of what you must do while deploying, along with cost targets.

In my case, there are established processes and designated teams for cleaning & collecting data, but you still do a part of it yourself to provide guidelines. So, even though data is always a perpetual problem, I can shed off most of that boring stuff.

Ah, and of course you're not a real engineer if you don't spend at least 1-2% of your time explaining to other people (surprisingly often to a technical staff, but not ML-oriented) why doing X is a really bad idea. Or, just explaining how ML systems work with ill-fitted metaphors.