

by godelski 675 days ago

As an ML researcher, my advice for you is: don't.

ML moves fast, but not as fast as you probably think. There's a difference between innovations in architectures and demonstrations of them in domains (both are useful, both are necessary research, but they are different).

Instead, keep up with what tools are relevant to you. If things are moving fast and aren't sticking, then in a way they aren't moving fast, are they? You're just chasing hype and you'll never keep up.

On the production side, I also see a common mistake of relying too heavily on benchmarks. I understand why this happens, but the truth is more nuanced. Just because something works well on a benchmark does not mean it will work well (or better than the alternatives) on your application. ResNet is still commonly used and still a great option for many applications. Not everything needs a 1B+ parameter transformer. Consider your constraints: performance, compute, resource costs, inference time, and all that jazz. Right now, if you have familiarity (no need for expertise) with FFNs (feed forward/linear), CNNs, ResNets, and Transformers, you're going to be fine. Though I'd encourage you to learn more about training procedures like GANs (commonly mistaken for an architecture), unsupervised pretraining (DINO), and tuning. It may be helpful to learn diffusion and LLMs at a high level, but that depends on your use cases. (And learn whatever you're interested in and passionate about! Don't let need stop you, but if you don't find this stuff interesting, don't worry either. You won't be left behind.)
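To make the benchmark point a bit more concrete, here is a minimal sketch of picking a model by measuring it on your own data under your own constraints, rather than by its published score. Everything in it is hypothetical and for illustration only: `Candidate`, `evaluate`, `pick_model`, the `benchmark_score` field, and the latency budget are names I made up, not any library's API, and a real setup would measure whatever constraints actually matter to you (memory, cost, throughput).

```python
from dataclasses import dataclass
from time import perf_counter
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Candidate:
    """Hypothetical wrapper around a model you are considering."""
    name: str
    predict: Callable[[float], int]  # assumed interface: one input -> one label
    benchmark_score: float           # published number, kept only for reference


def evaluate(candidate: Candidate,
             inputs: Sequence[float],
             labels: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, mean seconds per prediction) on your own data."""
    start = perf_counter()
    predictions = [candidate.predict(x) for x in inputs]
    elapsed = perf_counter() - start
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels), elapsed / len(inputs)


def pick_model(candidates: Sequence[Candidate],
               inputs: Sequence[float],
               labels: Sequence[int],
               max_latency_s: float) -> Optional[Tuple[Candidate, float, float]]:
    """Pick the most accurate candidate on *your* data that fits the latency budget.

    The benchmark score is deliberately ignored in the decision.
    Returns None if no candidate fits the budget.
    """
    best = None
    for candidate in candidates:
        accuracy, latency = evaluate(candidate, inputs, labels)
        if latency > max_latency_s:
            continue  # violates the constraint, no matter how accurate
        if best is None or accuracy > best[1]:
            best = (candidate, accuracy, latency)
    return best
```

The design choice is simply that the published number never enters the comparison: a candidate with a higher `benchmark_score` can still lose to a smaller one if it does worse on your held-out data or blows your latency budget.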

If you aren't just integrating tools and need to tune models, then do spend time learning this and focusing on generalization. The major lessons learned here have not drastically changed for decades, and that is likely to stay the case. We do continue to learn and get better, but this doesn't happen in leaps and bounds. So it is okay if you periodically revisit instead of trying to keep up in real time. Because in real time, gamechangers are infrequent (of course everyone wants to advertise being a gamechanger, but we're not chasing every new programming language, right?). Let the test of time reduce the noise for you.

  >  I most often can’t see any use case for AI/ML in our products

This is normal. You can hamfist AI into anything, but that doesn't mean it is the best tool for the job. Ignore the hype and focus on the utility. There's a lot of noise, and I am extremely sympathetic to this.

Look to solve problems and then find the right tool for the problem; don't look for problems to justify a tool (though that's fine for educational purposes).