| I go back and forth on this. A year ago, I was optimistic and I have had 1 case where RL fine tuning a model made sense. But while there are pockets of that, there is a clash with existing industry skills. I work with a lot of machine learning engineers and data scientists and here’s what I observe. - many, if not most MLEs that got started after LLMs do not generally know anything about machine learning. For lack of clearer industry titles, they are really AI developers or AI devops - machine learning as a trade is moving toward the same fate as data engineering and analytics. Big companies only want people using platform tools. Some ai products, even in cloud platforms like azure, don’t even give you the evaluation metrics that would be required to properly build ml solutions. Few people seem to have an issue with it. - fine tuning, especially RL, is packed with nuance and details… lots to monitor, a lot of training signals that need interpretation and data refinement. It’s a much bigger gap than training simpler ML models, which people are also not doing/learning very often. - The limited number of good use cases means people are not learning those skills from more senior engineers. - companies have gotten stingy with sme-time and labeling What confidence do companies have in supporting these solutions in the future? How long will you be around and who will take up the mantle after you leave? AutoML never really panned out, so I’m less confident that platforming RL will go any better. The unfortunate reality is that companies are almost always willing to pay more for inferior products because it scales. Industry “skills” are mostly experience with proprietary platform products. Sure they might list “pytorch” as a required skill, but 99% of the time, there isn’t hardly anyone at the company that has spent any meaningful time with it. Worse, you can’t use it, because it would be too hard to support. |
More than once I've just done labeling "on my own time" - I don't know the subject as well but I have some idea what makes the neurons happy, and it saves a lot of waiting around.
I've found tuning large models to be consistently difficult to justify. The last few years it seems like you're better off waiting six months for a better foundation model. However, we have a lot of cases where big models are just too expensive and there it can definitely be worthwhile to purpose-train something small.