by LukeShu 2777 days ago
I used to work at a startup that made physical robots. The robot generated several GBs of data every time it turned on. You're correct, most of that data wasn't looked at most of the time. But every now and then, someone would say "Hey, I saw a robot do something funny the other day, what the hell happened?" And having all that data usually made it possible to figure out what happened. To me, "maximum data for the hell of it" isn't about generating insight by looking at trends, it's about generating insight during incident analysis.

> To me, "maximum data for the hell of it" isn't about generating insight by looking at trends, it's about generating insight during incident analysis.

Agree 100%.

That is a very particular use case, where very high-res data is critical. I note that even here, you're interested in data from "the other day", not from years ago.

In most cases, time spent maintaining terabytes of rapidly aging time series data would be better spent elsewhere.

I think that really depends on the case.

A particularly good high-frequency trader might be interested in terabytes of minutiae when they're trying to sort out what caused yesterday's spike and crash of ticker XYZ.

Systems and sales analysts who are looking at web store front ends (and back ends, if there are issues) would be interested in large volumes of data, specifically corner cases (users who don't follow a statistically significant path), when trying to sort out a UI/UX redesign.

Traffic and transit analysts might want terabytes of data (especially with date and weather indicators) when considering what kind of freeway interchange to add to a growing area.

I suppose I could go on...