by apurvamehta
56 days ago
Yes, this is a current issue. There are a few ways to address it:
1. It gets slower as you select more series over longer periods of time because the series have to be pulled for each time bucket in the range, and then the samples have to be pulled for each bucket. By compacting older buckets and merging samples together, historical queries should become pretty comparable to 'more recent' cold queries.
2. We don't pre-cache all the metadata today. If we did that, then we could parallelize sample loads much more efficiently, lowering latency.
3. There is a lot of room to do better batching and to tune the parallelism of cold reads. We've only been at this for a couple of months. The techniques to improve latency on object storage are well known; we just have to implement them.

Another benefit: all the data is on S3, so spinning up more optimized readers to transform older data for more detailed analysis is also an option with this architecture.
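
To make the bucket point concrete, here is a rough sketch in Python. It is not our actual implementation: the bucket layout, the names `compact` and `cold_read`, and the in-memory dict standing in for S3 are all assumptions made up for illustration. It merges hourly buckets into daily ones and counts how many simulated object fetches a cold read needs before and after, with the fetches issued in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical layout: bucket_start -> {series_id: [(timestamp, value), ...]}
Bucket = dict[str, list[tuple[int, float]]]


def compact(buckets: dict[int, Bucket], width: int) -> dict[int, Bucket]:
    """Merge small buckets into larger ones aligned to `width` seconds."""
    merged: dict[int, Bucket] = {}
    for start in sorted(buckets):
        target = merged.setdefault(start - start % width, defaultdict(list))
        for series_id, samples in buckets[start].items():
            target[series_id].extend(samples)
    # Keep samples in timestamp order inside each merged bucket.
    return {
        start: {sid: sorted(samples) for sid, samples in bucket.items()}
        for start, bucket in merged.items()
    }


def cold_read(buckets, width, series_id, lo, hi, workers=8):
    """Read one series over [lo, hi); returns (samples, number of fetches)."""
    # Every bucket that overlaps the query range costs one fetch.
    starts = [s for s in sorted(buckets) if s + width > lo and s < hi]

    def fetch(start):
        # Stand-in for a single object-store GET (assumption, no real I/O).
        return buckets[start].get(series_id, [])

    # Issue the per-bucket fetches in parallel; map() preserves bucket order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(fetch, starts))
    samples = [s for chunk in chunks for s in chunk if lo <= s[0] < hi]
    return samples, len(starts)


# 48 hourly buckets, one series, a sample every 15 minutes.
hourly = {
    h * 3600: {"cpu": [(h * 3600 + 60 * m, float(h)) for m in range(0, 60, 15)]}
    for h in range(48)
}
daily = compact(hourly, 86400)

before, gets_before = cold_read(hourly, 3600, "cpu", 0, 48 * 3600)
after, gets_after = cold_read(daily, 86400, "cpu", 0, 48 * 3600)
print(gets_before, gets_after, before == after)
```

Same samples come back either way; the compacted layout just needs far fewer round trips, which is where most of the cold-read latency goes on object storage.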