by ska 1805 days ago
I find #1 "Subject matter experts have as much impact as data scientists" surprising only in that it was considered surprising.

I think this is one of those points that is obvious in retrospect but almost universally underappreciated.

Almost all data science workflows treat the annotators or subject matter experts as secondary. The tooling isn't set up to put them at the centre of the process and make it easy for them to collaborate with the more technical folks.

Perhaps it should be obvious, but it's definitely overlooked in much of academic ML and in MLOps.

> it's definitely overlooked in much of academic ML

right - but it seems like a, if not the, first lesson you learn when you leave the classroom for the "real" world.

It's not surprising to me that many people think this way, but it seems to be a characteristic of inexperience (or narrow experience).

Maybe so, but most data science workflows still don't acknowledge this "obvious" truth.
Compare this with the famous quote:

> Every time I fire a linguist, the performance of our speech recognition system goes up. - Fred Jelinek

> Every time I fire a linguist, the performance of our speech recognition system goes up. - Fred Jelinek

This one is easy to misapply. If you are applying your domain experts to the model, you might have a bad time. If you are applying them to the data, most likely not. And data is usually more important than the model.

> And data is usually more important than the model.

idk. we went from conv nets to transformers and saw the quality of our predictions go up while data prep time dropped by a factor of 20.

no change in data, just a better model.

in my field, improvements are nearly always made in the model. never in the data or data prep. (crowd counting, people tracking, etc)

I very nearly said this myself!

I think the mistake of this quote is in the application of the expertise. The bitter lesson is that data + compute can outperform inductive biases, but that doesn't mean you don't need domain expertise to get the right data.

The 80s called and they want their subject matter training back.