Ask HN: Who is actually doing something with "Big Data"?
5 points
by cbo
5259 days ago
I'm not doubting the validity of the term. Obviously it exists. The big companies -- Google, Facebook, Twitter, etc. -- all have enormous datasets to work with, so of course it's real. But I read this term so much that it seems as though EVERYONE is working with it, and that just doesn't seem feasible to me. Do you work with "Big Data" at your (preferably not aforementioned) company? What do you do with it? Where do you get it from? Are there any significant pain points?
The biggest challenge is that data feeds originate in nearly all of those countries and also need to be distributed efficiently to every other country. (e.g., NASDAQ originates in the US and reaches around the globe, and the same is true for real-time feeds on the opposite side of the globe in the Middle East, India, Singapore, Hong Kong, Tokyo, etc.) The Internet is not reliable from a latency point of view, so along with the required hardware comes the required network. We operate one of the largest private networks in the world.
edit: Also, from a processing point of view, we have had great success speeding up complex algorithms that would normally take minutes to run across huge compute clusters, bringing them down to seconds by porting them to large GPU clusters. Certain things are definitely suited to running on GPUs, but I feel it is still pretty foreign to most programmers, and it is hard for companies to decide to jump into that kind of project. You're starting to see more specialized use of GPUs or slower-clock-but-massively-parallel compute devices for a wider variety of tasks. (e.g., http://gigaom2.files.wordpress.com/2011/07/facebook-tilera-w...)