by cachemiss 2608 days ago
Generally these tend to be systems that work with machine-generated data. My experience is with sensor data generated by automobiles (automated car efforts).

Naive solutions tend to do one of three things: summarize the data; store it as logs and then run batch processes to index it in some form (or leave it unindexed and just brute-force the computation); or limit the incoming data rate to whatever can be indexed.
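To make the first of those concrete, here is a rough sketch of what "summarize the data" might look like: collapsing raw readings into fixed time windows. The function name, the (timestamp, value) tuple shape, and the one-second window are all made up for illustration, not taken from any real pipeline.

```python
from collections import defaultdict
from statistics import mean

def summarize(readings, window_s=1.0):
    """Collapse (timestamp, value) readings into per-window min/max/mean/count.

    Hypothetical helper: the input shape and window size are assumptions.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        # Assign each reading to the window its timestamp falls in.
        buckets[int(ts // window_s)].append(value)
    return {
        w: {"min": min(vs), "max": max(vs), "mean": mean(vs), "count": len(vs)}
        for w, vs in sorted(buckets.items())
    }

# Five made-up sensor readings spread over three one-second windows.
readings = [(0.1, 10.0), (0.6, 14.0), (1.2, 20.0), (1.9, 22.0), (2.5, 30.0)]
summary = summarize(readings)
```

Only the aggregates survive, so anything you later want to ask about individual readings can no longer be answered. That is the trade-off with this approach.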

These can work for some use cases, but they make it very difficult to operationalize these data sources (i.e., use them to make real-time-ish decisions).

Even human-generated data sources (Facebook, Twitter, etc.) can generate something close to that data rate.