Hacker News
by onislandtime 4432 days ago
Thinking people (as opposed to those dedicated to promotional activities) should stop using the term "data scientist". All scientists are data scientists; otherwise we would call them philosophers. Data for the sake of data is not a science. While you are at it, please also stop using the term "big data" (often people just mean "do something with the data"). If you need a computer cluster and MapReduce because the data doesn't fit on your Mac, then refer to distributed data stores and computing systems. Also, please drop the term "deep learning" when you are referring to using more compute power to run more complex models. Thanks.
4 comments

Deep learning is suss, I think, but I think that the name is apt - learning more than 3 layers in a network is Deep...

I spent a lot of time ~1995 trying to learn complex neural networks and I have an appreciation of the difficulties and the opportunity that big modern clusters therefore afford us.

"Deep learning" has an actual meaning, which is the use of neural networks with multiple hidden layers. (Networks with one hidden layer can theoretically approximate any mathematical function, but it's the investigation of deeper networks, with more, that has reinvigorated neural net research over the past few years). I'm sure it is being misused, but there is a legitimate, technical meaning to it.

"Data scientist" seems to be a way for mathematically literate programmers to separate themselves from the teeming masses of commoditized ScrumDrones. It seems to mean, "this person is smart enough to deserve dibs on the most interesting work". Perhaps it's an attempt to back to the R&D culture that existed before biztards commoditized us and our work.

Most of the fuss around "data science" makes me think of the Fundamental Theorem of Employment. If you're hired for a job, it's typically either (1) to do a job the person hiring you can't do for himself or (2) to do a job he doesn't want to do. Type-1 workers are respected and have autonomy. Type-2 workers are generally ill-regarded (because the boss thinks he can do the worker's job). "Data Scientist" seems to be a way for a programmer to say, "Only hire me for Type-1 work".

I can't say I'm a huge fan of the title's existence, because most companies use "data scientist" as Biztard for "person who does watered-down machine learning", but I suppose the current climate is an improvement over the AI winter.

Yes, the latest work on neural networks is a breakthrough for sure. So are the advancements in distributed computing and storage that make low-cost scalability possible. However, we should resist getting sucked into marketing terms and buzzwords. Terms like "self-driving car" are good because they are descriptive, accurate, and imply a paradigm shift. On the other hand, I may be wrong. For example, the term "microprocessor" seems to have transcended relative size and is now used to refer to a type of computer processor on an integrated circuit. Language evolves, but perhaps we can influence it by choosing good, meaningful names when we can.
Thank you so much.

I am a scientist, and these silly buzzwords, while they're great for salesmen who work for businesses, drive me up the wall!

Regarding "deep learning", you are totally right. Deep learning refers to a particular formulation that may inspire development of a different class of algorithms.

An anthropologist might disagree with you. Not all science relies on data; some scientists prefer "case studies".
I work with anthropologists. Their research relies entirely on data. It wouldn't be science otherwise.