by mkaltenecker 5080 days ago
It’s quite astonishing that so many people here prefer readability over correctness. What is wrong with you? Readability is optional. Correctness is not.

That’s obviously the wrong tradeoff. Never outright manipulate your data to make it more readable! If you can’t make your data readable enough without manipulating it you just can’t present your data that way. Period. Find a better way.

(I do not think there is any malice involved, though. Just pure stupidity. I mean, look at the number of people around here arguing for readability over correctness. If they are out here, some are also working for Nielsen.)

There's more at stake than that -- malice or not, a simple, tidy diagram can bypass a human's critical faculties and strengthen the feeling that the false data (and whatever conclusion it supports) is true. For a statistical research organisation, this is straight-up negligence!

Part of my job involves electrical drafting, and though I'm positively anal about my drawings, if I manipulate the design to make the drawing more clear and presentable, it's a snowball's chance I'd ever be excused.

In my case, the risk is starting a fire in a power supply; in theirs, the risk is manipulating a product into market death, however many dollars that might entail. Professionals have higher standards in their field, precisely because we trust them with expert information and give more weight to their decisions. Stupidity is no excuse in this case.

How much does it matter, though? A pie chart would paint a very different picture, with iOS as a clear minority.
Here's what this looks like as a pie chart (Nielsen & Comscore data), by the way: http://www.asymco.com/2012/07/13/how-many-lumia/
How so? Android would be 183°, iOS would be 122°, RIM would be 32°. You could easily see that Android is slightly above 50% – but I’m not really sure how the iOS market share would look markedly different.

Since 50% isn’t really a very important threshold (though that could be argued) I would very much argue against using a pie chart. (I prefer areas or lengths to angles.) Plus, a pie chart wouldn’t make it easy to add platform subdivisions.
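For reference, the angles quoted above can be turned back into percentage shares with a little arithmetic. This is only a rough sketch in Python: the function names are made up for illustration, and the only inputs are the three angles given in this comment.

```python
def angle_to_share(degrees: float) -> float:
    """Convert a pie-slice angle in degrees to a percentage share."""
    return degrees / 360.0 * 100.0


def share_to_angle(percent: float) -> float:
    """Convert a percentage share to a pie-slice angle in degrees."""
    return percent / 100.0 * 360.0


# Angles quoted in the comment above (nothing else is assumed).
angles = {"Android": 183, "iOS": 122, "RIM": 32}

shares = {name: round(angle_to_share(deg), 1) for name, deg in angles.items()}
print(shares)  # {'Android': 50.8, 'iOS': 33.9, 'RIM': 8.9}

# Whatever is left of the circle belongs to the platforms not listed.
remainder_degrees = 360 - sum(angles.values())
print(remainder_degrees)  # 23
```

So the iOS slice would be roughly a third of the circle, which is the point being made: a pie chart shows Android just past the halfway mark, but it does not obviously shrink iOS.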

It's a log scale. The bottom axis should use tick marks to make that obvious. A poorly marked log scale is certainly something to fix. It's not the same as fudging the data.

[edit: If you wish to say a log scale is inherently misleading in this context ... go for it. That's different than saying the data is manipulated.]

[Edit2: The area is not meaningful. The widths are meaningful if there is a total ordering and the scale is labeled. ... I do however agree I am probably way overestimating how obvious a [log(1+cumulative percentile),log(1)] mapping onto the x-axis is. Also the chart does scream "compare areas", and those are strictly meaningless unless comparing within the same OS.]

Can someone explain how a log scale could work for this sort of stacked graph? (Serious question, not rhetorical question.)

If the X axis is a log scale of market share, what would happen if Apple and Android both had 40% market share? Both bars would overlap.

If X is cumulative market share, the bar width would depend on order and the two hypothetical 40% companies would have different widths.

If each bar has area proportional to the log, how would that work? The logs of market share are negative unless there is an arbitrary constant in there. Also the vertical breakdown doesn't make any sense in that case, because the areas of the vertical blocks don't add up to the OS total. Also small market shares would have negative width?

So can someone explain how log scale could even theoretically work here?
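The cumulative case can be checked numerically. This is a rough sketch, assuming the x-axis maps cumulative share through log(1 + x) as suggested in the edit upthread. The function name and the 40/40/20 split are hypothetical, chosen only to match the two-companies-at-40% scenario in the question.

```python
import math


def log_cumulative_widths(shares):
    """Bar widths when the x-axis is log(1 + cumulative share).

    `shares` are fractions summing to 1, in left-to-right order.
    """
    widths = []
    cumulative = 0.0
    for share in shares:
        left = math.log(1.0 + cumulative)
        cumulative += share
        right = math.log(1.0 + cumulative)
        widths.append(right - left)
    return widths


# Hypothetical market: two platforms at 40% each, one at 20%.
widths = log_cumulative_widths([0.4, 0.4, 0.2])
print([round(w, 3) for w in widths])  # [0.336, 0.251, 0.105]

# The log of a share below 100% is negative, so it cannot serve as a width.
print(math.log(0.4) < 0)  # True
```

Under that assumed mapping the two 40% platforms do end up with different widths (about 0.336 versus 0.251), purely because of where they sit in the ordering, which is the objection raised above.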

Since when is a log scale used for market share? Ever seen a log scale of browser usage?
"Since when is a log scale used for market share?"

At least since the appearance of data sets that lend themselves to such visualization. For example, how better to compare the growth of various platforms, from the TRS-80 days to the iPad, while the entire industry grows exponentially?

http://www.asymco.com/2012/01/17/the-rise-and-fall-of-person...

Edit: But I guess that's really market magnitude over time, not quite what you asked.

Two reasons:

1. The numbers are right there for all to see.

2. Nielsen's business is not to make academic-grade charts. Their version is far, far better for their customers' needs.

The first is an "it's not a lie because we explain it in the small print" argument. Technically true but practically false.

The latter is just a bare assertion. Got proof? I'd bet not. Which is why you had to say "far, far", hoping that people would just go along with you.

Personally, I'd think that Nielsen's business is to make sure their customers know what's going on. That's not an argument for running the second graph; it's an argument for making a third graph that conveys the correct intuition. Or just to publish a table of numbers.

"Personally, I'd think that Nielsen's business is to make sure their customers know what's going on."

In my experience, that is not the case. The information they provide is generally used by middle managers in large companies in internal PowerPoint decks with the intent of waging intra-company warfare. The use of the info is highly political and opinionated, not rational and academic.

So, yeah, it would be preferable to put out an immaculate chart with perfect proportions, good design, and clear text. But often it's just easier to cram the words in and make it fit. The bottom line is that the intended target of these charts just does not care about these details. They have an agenda of their own, and will use the Nielsen data to advance it. For Nielsen to spend time and money obsessing over these sorts of things would go largely unappreciated.

Is it great? No. Even good? No. Does it meet their customers' standards and needs? Yes.

"For Nielsen to spend time and money obsessing over these sorts of things would go largely unappreciated."

This would be extremely short-sighted thinking.

I'm a big believer in the art of not doing work that's unnecessary, but the art is in knowing when it matters. When not doing the work publicly and directly contradicts your brand's supposed strengths, that's a problem.

Nielsen's brand is built upon a reputation for high-quality, detailed demographic data. This is the basis on which customers buy data from Nielsen, and it is what gives that hypothetical middle manager's PowerPoint slide some weight. "This is from Nielsen, so we can trust that it's good data, not some up-and-to-the-right chart I tortured out of our data."

Events like this damage that brand. The damage may not manifest itself directly in sales up front, but in the long term, if the weight of "this is from Nielsen..." is gone, then even in the cynical case where all the customers are clueless, Nielsen will lose out to another data provider that has the right reputation.

The customers aren't clueless. They just have other cares.
Even worse IMO.
Note that you've gone from "far, far better for their customers' needs" to "just as good for their (pointless) purposes" but cheaper. Which is a much weaker argument.

I also think your "nobody cares about the data" argument is weak. But I suspect you know that already, and were just trying to argue your way out of a hole so I won't belabor it.

Having worked in newspapers creating graphics exactly like this, my editors would have made me publish a correction explaining this mistake. It was always my understanding that if I made a number of errors like this, I'd quickly find myself looking for a job.

A statistics and data-driven company like Nielsen should be ashamed, doubly so if they haven't published a correction.