Are the errors in task size estimates normally distributed? I really do expect the distribution to have a fat tail, and if it does, you can't add means either.
There’s a variety of analyses out there and they very consistently show a log-normal distribution for release predictions. I’ve analyzed Star Citizen’s publicly available data and found the same for their task estimates. It’s very reliable.
You do see truncated log-normals, though, when the estimates are padded.
I think most of us in software engineering assume the probability distribution has a fat tail. I've seen some authors call this the "blowup factor". For instance, your most likely estimate is 10 days, the best case is 5 days, and the worst case is 30 days. I think adding means is still meaningful as long as the mean is finite, which it is for a log-normal (see linearity of expectation, plus the central limit theorem and law of large numbers for the total).
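To make the point about adding means concrete, here is a rough sketch in Python. The three tasks and their (mu, sigma) values are made up for illustration and are not fitted to any real data, and sampling the tasks independently is just a simplifying assumption. It compares the sum of the analytic log-normal means with a simulated mean of the total, and also shows that the sum of medians comes out lower than the expected total.

```python
import math
import random


def lognormal_mean(mu, sigma):
    """Analytic mean of a log-normal with log-space parameters mu and sigma."""
    return math.exp(mu + sigma ** 2 / 2)


def simulate_total_mean(tasks, trials=50_000, seed=0):
    """Monte Carlo estimate of the mean total duration across all tasks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sum(rng.lognormvariate(mu, sigma) for mu, sigma in tasks)
    return total / trials


# Hypothetical tasks as (mu, sigma) in log-days; the median of each is exp(mu).
TASKS = [(math.log(10), 0.6), (math.log(3), 0.8), (math.log(20), 0.5)]

analytic_sum = sum(lognormal_mean(mu, sigma) for mu, sigma in TASKS)
simulated_sum = simulate_total_mean(TASKS)
median_sum = sum(math.exp(mu) for mu, _ in TASKS)

print(f"sum of analytic means: {analytic_sum:.2f} days")
print(f"simulated mean total:  {simulated_sum:.2f} days")
print(f"sum of medians:        {median_sum:.2f} days")
```

Under these assumptions the first two numbers should agree closely, while the sum of medians underestimates the expected total. That gap is where the fat tail shows up: adding "typical" estimates is not the same as adding means.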