gauged: a time series database (github.com)
22 points by chrisohara 4468 days ago
2 comments

Interesting storage method; I like it. Some notes:

- A MySQL backend should be very fast, but scaling it across multiple shards is left as an exercise for the user. Perhaps CitusDB (PostgreSQL compatible) could be useful here.

- Metrics cannot be tagged. This will make it useless for any sort of rollups or breakdowns ("give me the sum of requests over my servers in the XYZ data center"; "give me the requests for each server in the XYZ data center by hostname").

The second issue in particular needs attention before it can compete with enterprise-grade metrics solutions such as Datadog.
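
To make the rollup/breakdown distinction concrete, here is a rough sketch of what I mean. It runs on plain in-memory tuples, not on Gauged; the `samples` list, the tag names and the `rollup`/`breakdown` helpers are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical samples: (metric, tags, value). This is NOT Gauged's data model.
samples = [
    ("requests", {"host": "server1", "dc": "XYZ"}, 120),
    ("requests", {"host": "server2", "dc": "XYZ"}, 80),
    ("requests", {"host": "server3", "dc": "ABC"}, 50),
    ("requests", {"host": "server1", "dc": "XYZ"}, 30),
]

def rollup(samples, metric, **where):
    """Sum every sample of `metric` whose tags match all pairs in `where`."""
    return sum(
        value
        for name, tags, value in samples
        if name == metric and all(tags.get(k) == v for k, v in where.items())
    )

def breakdown(samples, metric, by, **where):
    """Sum matching samples separately for each value of the `by` tag."""
    totals = defaultdict(int)
    for name, tags, value in samples:
        if name == metric and all(tags.get(k) == v for k, v in where.items()):
            totals[tags.get(by)] += value
    return dict(totals)

# "sum of requests over my servers in the XYZ data center"
print(rollup(samples, "requests", dc="XYZ"))

# "requests for each server in the XYZ data center, by hostname"
print(breakdown(samples, "requests", by="host", dc="XYZ"))
```

Both queries filter on one tag and the second also groups by another, which is the part a single untagged key per metric cannot express.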

Tagging would be the next addition. I added the ability to search for keys by prefix quite efficiently, so provided one stored keys like "requests:server1", "requests:server2", one could easily run the following:

    requests = 0
    for key in gauged.keys('requests:'):
        requests += gauged.aggregate(key, Gauged.SUM, start=-Gauged.WEEK)

Tagging needs to be multi-dimensional to be effective (e.g., host=X, device=Y, interface=Z).
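
A quick sketch of why a prefix scheme alone falls short, assuming tags were packed into a flat key. The key format and the `encode_key`/`decode_key`/`match` helpers are hypothetical (nothing here is part of Gauged), and the encoding assumes tag values never contain ":" or "=".

```python
def encode_key(metric, **tags):
    """Build a flat key, e.g. 'requests:device=nic0:host=server1' (made-up format)."""
    parts = [metric] + ["%s=%s" % (k, tags[k]) for k in sorted(tags)]
    return ":".join(parts)

def decode_key(key):
    """Split a flat key back into (metric, tags)."""
    metric, *pairs = key.split(":")
    return metric, dict(pair.split("=", 1) for pair in pairs)

def match(keys, metric, **where):
    """Return the keys of `metric` whose tags include every pair in `where`."""
    found = []
    for key in keys:
        name, tags = decode_key(key)
        if name == metric and all(tags.get(k) == v for k, v in where.items()):
            found.append(key)
    return found

keys = [
    encode_key("requests", host="server1", device="nic0", interface="eth0"),
    encode_key("requests", host="server1", device="nic0", interface="eth1"),
    encode_key("requests", host="server2", device="nic0", interface="eth0"),
]

# A prefix lookup only helps for the leading dimensions of the key...
by_prefix = [k for k in keys if k.startswith("requests:device=nic0:host=server1")]

# ...while filtering on a trailing dimension means decoding every key.
by_interface = match(keys, "requests", interface="eth0")
```

With tags sorted alphabetically, `interface` always lands last in the key, so a query on it alone degrades to a full scan of the keys. That is the case real multi-dimensional tagging (one index per tag) would handle.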

I'd love to see an honest comparison against RRDtool. You mention it once in "Support for sparse data (unlike the fixed-size RRDtool)".

What are the other advantages, disadvantages and trade-offs?