by rg111 986 days ago
This is closer to traditional SWE than AI research.

Take some courses and get some certifications. And also make some serious projects where you demonstrate your capabilities with cutting edge tools.

This is more focused on tools and use of said tools.

Take some trained models, and demonstrate how well you can use them.

Some ideas:

1. Take a cats vs. dogs model, deploy it online. Design an API around it. Document the API well. Create a mechanism to show confidence score, and store low confidence score examples in a database that you can later manually label and retrain the model with.

2. Take a smallish LLM, design a VS Code extension that documents your functions based on their docstrings.
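
For anyone wondering what the "store low confidence examples" part of idea 1 might look like, here is a minimal sketch, not a definitive implementation. It assumes the model hands back a dict of label probabilities. The 0.8 cutoff, the `review_queue` table, and every function name are made up for illustration, and SQLite stands in for whatever database you would actually use.

```python
import sqlite3

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff; tune it for your own model


def init_db(conn):
    # Hypothetical schema: one row per prediction that needs a human label.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS review_queue ("
        "image_id TEXT PRIMARY KEY, "
        "predicted_label TEXT, "
        "confidence REAL, "
        "human_label TEXT)"
    )


def handle_prediction(conn, image_id, probs, threshold=CONFIDENCE_THRESHOLD):
    """Build the API response and queue low-confidence examples for labeling.

    probs maps label -> probability, e.g. {"cat": 0.55, "dog": 0.45}.
    """
    label = max(probs, key=probs.get)
    confidence = probs[label]
    needs_review = confidence < threshold
    if needs_review:
        conn.execute(
            "INSERT OR REPLACE INTO review_queue "
            "(image_id, predicted_label, confidence, human_label) "
            "VALUES (?, ?, ?, NULL)",
            (image_id, label, confidence),
        )
        conn.commit()
    return {"label": label, "confidence": confidence, "needs_review": needs_review}


def unlabeled_examples(conn):
    # Least confident first, so the most uncertain examples get labeled first.
    return conn.execute(
        "SELECT image_id, predicted_label, confidence FROM review_queue "
        "WHERE human_label IS NULL ORDER BY confidence"
    ).fetchall()
```

The API endpoint would call `handle_prediction` with the model output and return the resulting dict; a separate labeling tool would read `unlabeled_examples`, fill in `human_label`, and feed those rows back into retraining.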

Just demonstrate your basic knowledge in ML, and really good software engineering skills, learn the vocabulary well, and then start applying for jobs. It's much better if you have a CS/EE degree.

2 comments

As someone in this space I could not disagree more.

Certifications will do nothing for you. The harsh reality is only real world experience doing this stuff at scale will help you understand all the complexity involved. There are tons of people trying to hop onto this train after taking a few online courses and it's making it hard to filter down candidate pools.

I ask my job candidates simple/foundational questions and even those are hard for most of them. I don't care about the degree, but I care that the candidate can access and reason about the core concepts. And I value programming skills a lot.

As someone outside of this space who has taken a few of the well-regarded ML and deep learning courses on Coursera, I agree with you. You can get a certificate without learning a single thing, just by putting in a couple of hours of work, and even in good faith it's hard to get a ton out of it since the assignments are so shallow.

I do think they can be valuable if they help you learn the basics and get started on a bigger personal project, but not as something to put on your resume.

I would agree on one aspect though - deploying a model at scale is much closer to SWE than it is to foundational ML research. At a high level, you're deploying a function which has some known compute requirements. It requires setting up infrastructure, monitoring/logging, API setup etc. This is the sort of thing that a good devops engineer could probably make a horizontal move to, because a lot of the practical experience is similar. I don't think you need a particularly deep knowledge of ML unless you're also expected to track model performance and decide when re-training is needed. That's leaving aside the really distributed systems that require multi-node, multi-GPU setups (but again, if you have HPC experience, that should transfer).
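
To make the "it's just a function plus monitoring/logging" point concrete, here is a rough sketch using only the Python standard library. The `monitored` decorator and the `predict` stub are hypothetical stand-ins. A real deployment would more likely export these numbers to a metrics system than keep them in a dict.

```python
import functools
import logging
import time

logger = logging.getLogger("inference")


def monitored(fn):
    """Wrap an inference function to record call counts, errors, and latency."""
    stats = {"calls": 0, "errors": 0, "total_seconds": 0.0}

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        stats["calls"] += 1
        try:
            return fn(*args, **kwargs)
        except Exception:
            stats["errors"] += 1
            logger.exception("inference failed")
            raise
        finally:
            # Runs on both success and failure, so latency is always recorded.
            elapsed = time.perf_counter() - start
            stats["total_seconds"] += elapsed
            logger.info("inference took %.4fs", elapsed)

    wrapper.stats = stats
    return wrapper


@monitored
def predict(x):
    # Hypothetical stand-in for a real model call.
    if x is None:
        raise ValueError("empty input")
    return {"label": "cat", "confidence": 0.9}
```

Nothing in there is ML-specific, which is sort of the point: `predict.stats` and the log lines are the same kind of signal you would collect for any other service.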

The problem is a lot of tutorials just show you how to make a Flask/Gradio website (maybe FastAPI) and call it a day. A lot of the experience here is the sort of in the trenches practical stuff that you can't cover in a MOOC (and it's expensive to experiment with GPU clusters). I suspect there are better non-ML courses people could take though.

rg111 has a good strategy here. For step 1, making a cats vs dogs classifier, the first lesson of the FastAI course will give you a path for doing this right away, and I wholeheartedly and enthusiastically recommend this course.

A sibling commenter mentions that certifications will do nothing for you. They're not exactly wrong, because what ultimately matters is that you can demonstrate your skills. Certifications and to a large extent even degrees mean very little; what matters is that you convince employers you know how to do stuff. The best way to convince people you know how to do stuff is to be able to show a list of cool things you actually did. These courses and their certifications may not mean much on their own, but in the course of completing them you will develop skills and capabilities you can demonstrate and talk about in your resume and cover letter.