| This is a horrible explanation of variance. It also leaves out WHY we need variance, that is, what variance tells you that other measures like the mean and the range do not. Say you want to buy a car and choose a brand and model based on online user ratings of quality and value. Cars A, B, and C all have the same average rating, let's say 8 out of 10. How do you choose? You need more information, but all you have are the ratings. You could look at the range of the ratings, which is the difference between the maximum rating and the minimum rating. But what if only one or two people gave one car a very low rating of 1 or 2, whereas another car got a lot of low ratings of 3 and 4 but no one rated it a 1 or 2? The range alone might not characterize the ratings as a whole, because it depends only on the two most extreme ratings, so a single person (data point) can skew it. What you want to look at is the spread of the ratings: how consistent or variable they are. A car with a lot of 7, 8, and 9 ratings is better than a car whose ratings are all over the place but happen to average the same 8. When you buy a car with an average rating of 8 out of 10, you expect a car that is an 8, and you want to minimize the chance of getting a lemon. This spread can be measured by taking the difference between each individual rating and the average rating. If you simply add up all these differences, though, the negative ones cancel the positive ones exactly and the total is always zero. Variance therefore squares each difference, which makes them all positive (or zero), and then averages the squared differences. And so on... |
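
To make the car example concrete, here is a rough Python sketch. The ratings are made up for illustration (they are not real data), and the name `summarize` is just a placeholder. It assumes the "population" form of variance, which divides by the number of ratings; the sample form divides by one less. Each hypothetical car averages exactly 8, the raw differences from the mean sum to zero, and only the squared differences separate the cars.

```python
from statistics import mean, pvariance

# Hypothetical ratings, invented for illustration; each list averages 8.
ratings = {
    "A": [7, 8, 9, 8, 8, 7, 9, 8],   # consistent 7s, 8s, and 9s
    "B": [1, 9, 9, 9, 9, 9, 9, 9],   # one very low rating, otherwise consistent
    "C": [3, 3, 4, 4] + [10] * 9,    # several low ratings, but no 1s or 2s
}

def summarize(xs):
    """Return the mean, range, sum of raw deviations, and population variance."""
    m = mean(xs)
    deviations = [x - m for x in xs]
    return {
        "mean": m,
        "range": max(xs) - min(xs),
        # Negative and positive deviations cancel, so this is always zero.
        "sum_of_deviations": sum(deviations),
        # Squaring removes the sign; dividing by n averages the squares.
        "variance": sum(d * d for d in deviations) / len(xs),
    }

summaries = {car: summarize(xs) for car, xs in ratings.items()}
for car, s in summaries.items():
    print(car, s)

# Cross-check the hand-rolled formula against the standard library.
for car, xs in ratings.items():
    assert abs(summaries[car]["variance"] - pvariance(xs)) < 1e-9
```

With these made-up numbers, car B has the widest range because of its single rating of 1, yet car C has the larger variance because many of its ratings sit far from 8. That is the case where the range alone would mislead you.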