by tuukkah
2778 days ago
TLDR: "Why Not to Build a Time-Series Database? Because we're building one and you should pay us."

> Hopefully our story will make you think twice before trying to build your own TSDB in house using open-source solutions, or if you’re really crazy, building a TSDB from scratch. Building and maintaining a TSDB is a full time job, and we have dedicated expert engineers who are constantly improving and maintaining our TSDB, and no doubt will iterate the architecture again over time as we hit an even higher magnitude of scale down the line.

> Given our experience in this complex space, I would sincerely recommend you don’t try and do this at home, and if you have the money you should definitely outsource this to the experts who do this as a full time job, whether it's Outlyer or another managed TSDB solution out there. As so many things turn out in computing, it’s harder than it looks!
|
When I see:
"You Can Lose a Few Datapoints Here and There"
I see that these guys are barking up the wrong tree.
1. We used a single thread per network card. (Yes, we architected clusters/failovers, etc... but not once was it required because of data rates.)
2. The server could handle a fully saturated Gbit network at <50% CPU (per core).
3. Data was NEVER thrown away (but we had allowances in our API to let the client reading the data drop updates and get sub-second aggregates instead -- e.g. OHLC or summation).
4. Data was stored in basically flat file systems.
5. Our calculation engine was run 'downstream' toward the client ends, or on the client end, away from data collection. If needed (i.e. the calcs were expensive to run), these could feed back into the server for long term storage.
This was mid 2000. I'm sure this is not rocket science for modern day timeseries guys.
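
To make point 3 concrete, here is a rough sketch of what "drop updates and get sub-second aggregates instead" could look like on the reading side. This is not the original system's code: the names `Bar` and `ohlc_bars`, the `(timestamp, value)` tick shape, and the `bucket_seconds` parameter are all assumptions made for illustration. The server keeps every raw datapoint; only the reader collapses them.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Bar(NamedTuple):
    """Aggregate for one time bucket (hypothetical shape)."""
    open: float
    high: float
    low: float
    close: float
    total: float  # running summation of values in the bucket


def ohlc_bars(ticks: Iterable[Tuple[float, float]],
              bucket_seconds: float = 1.0) -> Dict[int, Bar]:
    """Collapse time-ordered (timestamp_seconds, value) ticks into OHLC bars.

    Assumes ticks arrive in timestamp order, so the first value seen in a
    bucket is its open and the last one seen is its close.
    """
    bars: Dict[int, Bar] = {}
    for ts, value in ticks:
        bucket = int(ts // bucket_seconds)  # index of the bucket this tick falls in
        bar = bars.get(bucket)
        if bar is None:
            # First tick in the bucket seeds every field.
            bars[bucket] = Bar(value, value, value, value, value)
        else:
            bars[bucket] = Bar(
                open=bar.open,
                high=max(bar.high, value),
                low=min(bar.low, value),
                close=value,
                total=bar.total + value,
            )
    return bars


if __name__ == "__main__":
    ticks = [(0.1, 10.0), (0.5, 12.0), (0.9, 9.0), (1.2, 11.0)]
    for bucket, bar in ohlc_bars(ticks).items():
        print(bucket, bar)
```

A slow client can ask for bars like these instead of every update, while the raw ticks stay on disk untouched.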