|
|
|
|
|
by ryanworl
1040 days ago
|
|
[WarpStream co-founder here] That is correct about flushing. RE: consuming. The TLDR; is that the agents in an availability zone cluster with each other to form a distributed file cache such that no matter how many consumers you attach to a topic, you will almost never pay for more than 1 GET request per 4MiB of data, per zone. Basically when a consumer fetches a block of data for a single partition, that will trigger an "over read" of up to 4MiB of data that is then cached for subsequent requests. This cache is "smart" and will deduplicate all concurrent requests for the same 4MiB blocks across all agents within an AZ. It's a bit difficult to explain succinctly in an HN comment, but hopefully that helps. |
|