by twstws
5133 days ago
I think the default should be the method that displays the most information. Why hide information if you don't have to? For one-dimensional data, a dotplot shows the reader everything. A boxplot reduces the information content, and mean plus error bars reduces it further. Mean plus error bars also imposes an assumed probability distribution, which may be wrong; it doesn't reveal a hidden truth. The same holds in two dimensions. Show me all the data, and include a regression line or a spline to highlight a trend. Only start hiding information when the scatterplot becomes misleading, that is, when overplotting prevents me from accurately assessing the actual distribution of the points. Jumping immediately to a density plot also restricts me to your interpretation, because the original data is lost. With a scatterplot, the raw data can be recovered from the plot, so I can do my own analysis should I be interested. This is common in meta-analyses that extract data from multiple published papers. If those original papers had used density plots instead of scatterplots, reanalysis would require direct access to the underlying data. Once the original author dies or loses the data, all further use of it becomes impossible.
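To make the information-loss point concrete, here is a minimal Python sketch using only the standard library. The two samples and the `summarize` helper are invented for illustration; they are not taken from any real dataset or from the linked graphic.

```python
import statistics

def summarize(values):
    """Return (mean, standard error): all that a mean-plus-error-bars plot keeps."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem

# Hypothetical samples, invented for illustration only.
clustered = [6] * 5 + [14] * 5                  # two tight clumps, nothing near 10
spread = [3, 5, 8, 9, 9, 11, 11, 12, 15, 17]    # one broad hump centred on 10

for name, values in [("clustered", clustered), ("spread", spread)]:
    mean, sem = summarize(values)
    quartiles = statistics.quantiles(values, n=4)  # roughly what a boxplot keeps
    print(name, "mean:", mean, "sem:", round(sem, 3), "quartiles:", quartiles)
    print("  dotplot:", sorted(values))            # every observation survives
```

Both samples have the same mean and the same sample standard deviation, so a mean-plus-error-bars plot would draw them identically. The quartiles differ, so a boxplot would separate them, but only the raw values show that one sample has no observations near its own mean.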
Best practice: http://www.nytimes.com/interactive/2012/05/09/us/politics/sa...