Hacker News
by searine 3987 days ago
>And how does the Right Question appear if not through exploration and manipulation of the data?

Questions don't magically come out of a data set. Trying to pull them out that way is called a fishing expedition, and it usually produces boring, descriptive results that have no impact.

To answer impactful questions, you must go into your data collection with the questions in mind. To understand what questions to ask, you need a trained, critical, and creative mind. That is something you don't get from pushing bits.

>If you can't program and manipulate data

Programming and manipulating data is easy. Almost every new statistician these days can do it, and does so routinely.

What's hard is the years of intuition about what is meaningful and what is noise.

I know. It's hard to hear, and career programmers hate to hear it most of all, but it's the truth.

1 comments

Anyone I've ever heard say "programming is easy" is without fail a terrible programmer.

I'm not really sure how to respond to the idea that exploring a dataset isn't a useful way to help develop questions about it. It's only a "fishing expedition" if you have no idea what you're doing.

>Anyone I've ever heard say "programming is easy" is without fail a terrible programmer.

Developing a world-class application is difficult because of the complexity inherent in a program of large scope.

Knowing enough programming to competently move a data set around is easy. Hell, you could do most of it with just bash.

>I'm not really sure how to respond to the idea that exploring a dataset isn't a useful way to help develop questions about it. It's only a "fishing expedition" if you have no idea what you're doing.

Well, I've seen a lot of it, in both science and business: people who spend a lot of time and money generating a large data set simply because they lack a question to ask. They expect meaningful answers to just tumble out of it like manna from heaven, and end up confused and dismayed when the answers aren't impactful.

Fishing expeditions are looked down upon because they can only describe the data you generated. That is minimally useful, and can be done without grabbing a huge sample.

Good science starts with a question, then puts data to work to create new insight by removing confounding factors through careful design.