by Rochus
1457 days ago
How do you know the sample is representative? Looking at the data posted by the OP, there are about 450 Clojure jobs a year. If we assume that developers switch jobs every three years at the median, we get about 1350 Clojure jobs in total. If we assume a very good response rate for the survey, about 3% of these developers, or about 40, responded. That is even fewer than the number of states in the US.
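The back-of-envelope estimate above can be written out as a short Python sketch. The inputs are the comment's own assumptions (450 jobs a year, three-year median tenure, 3% response rate). The margin-of-error line is an addition for illustration: it uses the textbook worst-case formula for a proportion and assumes the respondents were a simple random sample, which is exactly what is in question here.

```python
import math

# Assumptions taken from the comment above, not measured values.
jobs_per_year = 450
median_years_between_switches = 3
response_rate = 0.03

total_jobs = jobs_per_year * median_years_between_switches
respondents = total_jobs * response_rate

# Worst-case 95% margin of error for a proportion (p = 0.5),
# valid only if the respondents were a simple random sample.
n = round(respondents)
margin_of_error = 1.96 * math.sqrt(0.25 / n)

print(f"total jobs: {total_jobs}")
print(f"respondents: about {n}")
print(f"worst-case margin of error: {margin_of_error:.1%}")
```

Even under the optimistic random-sampling assumption, a sample of this size leaves wide uncertainty around any reported percentage.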
For example, suppose you have a population of 10,000 jobs, 9,000 of which are hiring for Clojure and 1,000 of which are hiring for Forth. If you sample only from the 9,000 Clojure jobs, you might conclude that 100% of all 10,000 jobs are for Clojure. But in reality, only 90% are.
Instead, you can sample 100 of the 10,000 jobs at random. The expected share of Clojure jobs in that sample is 90%. There will be noise, but it can be accounted for statistically.
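Here is a minimal Python sketch of the toy example above, using only the standard library. The function name `sample_share` and the fixed seed are illustrative choices, and the standard error uses the usual binomial approximation, ignoring the finite-population correction.

```python
import math
import random

def sample_share(population, n, seed=0):
    """Draw n items at random without replacement; return (share, standard error)."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    share = sum(sample) / n
    # Binomial approximation to the standard error of a proportion.
    std_error = math.sqrt(share * (1 - share) / n)
    return share, std_error

# 1 = Clojure job, 0 = Forth job (the toy population from the example).
population = [1] * 9000 + [0] * 1000

# Biased sample: only the Clojure jobs are ever looked at.
biased_share = sum(population[:9000]) / 9000

# Random sample of 100 jobs from the whole population.
random_share, std_error = sample_share(population, 100)

print(f"biased share: {biased_share:.0%}")
print(f"random-sample share: {random_share:.0%} (standard error {std_error:.1%})")
```

The biased estimate is always 100%, while the random-sample estimate varies from draw to draw around 90%, with a spread that the standard error quantifies.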
If the population that you want to draw conclusions about is, say, every job offered in the US in 2021, it will be difficult to find either a data set that contains this whole population or one that is arguably a random subset of it. So representativeness is hard.
You could narrow your population definition to make representativeness plausible. For example, take the population of all developer jobs at companies that had an IPO between 2018 and 2021. Maybe you have a way to compile this data set from some source. This limits the scope of your claims, but it makes them more credible.
Another thing that you can do is take an existing data set that you know to be representative and compare the distribution of job characteristics in your sample to that. For example, you might find that your sample is more likely to include web development jobs than your reference data set. Then you know that your sample is not representative, and you know in what way it isn't. Or you might find that your sample is comparable to your reference data set. This can give you some confidence that your findings generalize.
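One way to make that comparison concrete is a chi-square goodness-of-fit check, sketched below in plain Python. The job categories, reference shares, and sample counts are all made up for illustration. The critical value 5.991 is the standard 5% cutoff for 2 degrees of freedom (three categories).

```python
def chi_square_statistic(observed_counts, reference_shares):
    """Pearson chi-square statistic of observed counts against reference shares."""
    total = sum(observed_counts.values())
    statistic = 0.0
    for category, share in reference_shares.items():
        expected = total * share
        observed = observed_counts.get(category, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical reference distribution from a data set assumed to be representative.
reference_shares = {"web": 0.50, "backend": 0.30, "other": 0.20}

# Hypothetical counts in your own sample of 100 jobs.
sample_counts = {"web": 70, "backend": 20, "other": 10}

statistic = chi_square_statistic(sample_counts, reference_shares)

# 5% critical value of the chi-square distribution with 2 degrees of freedom.
CRITICAL_5PCT_DF2 = 5.991
looks_different = statistic > CRITICAL_5PCT_DF2

print(f"chi-square statistic: {statistic:.2f}")
print("sample differs from reference" if looks_different else "no clear difference")
```

With these made-up numbers the sample over-represents web jobs and the statistic exceeds the cutoff, so the sample would be flagged as not representative on this characteristic. A statistic below the cutoff does not prove representativeness; it only fails to find a difference on the characteristics you checked.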