by majidazimi
2834 days ago
No, it doesn't: a single partition is stored sequentially on one disk, which limits its consumers to the bandwidth of that single disk (say c1 reads the beginning of the partition and c2 the end). With Pulsar, c1 is most probably connected to a different node than c2.
All of this is done while preserving the total ordering guarantee, thanks to the separation of sequencing and storage.
The operator could for example set a bigger node set size for logs that are known to have multiple consumers and require more IO capacity.
At Facebook, we have use cases where a single consumer needs to replay a backlog of records in a log, sometimes hours' or days' worth of data, to rebuild its state. We call this a backfill. Node sets allow the IO to be spread across multiple disks, which improves backfill speed and helps reduce hotspots.
-- Adrien from the LogDevice team.
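
To make the quoted point concrete, here is a rough toy model in Python. It is not LogDevice code: the round-robin placement, the function names, and the bandwidth figures are all assumptions chosen for illustration, and replication is ignored. It only shows why a larger node set shortens a backfill, and why ordering survives when sequence numbers are assigned separately from storage.

```python
import heapq
from collections import defaultdict


def place_records(num_records, nodeset_size):
    """Assign each sequence number to a node (hypothetical round-robin placement)."""
    nodes = defaultdict(list)
    for seq in range(num_records):
        nodes[seq % nodeset_size].append(seq)
    return nodes


def backfill_seconds(num_records, record_bytes, nodeset_size, disk_bytes_per_sec):
    """Estimate replay time when all disks in the node set are read in parallel."""
    nodes = place_records(num_records, nodeset_size)
    # The busiest disk bounds the replay time.
    busiest = max(len(seqs) for seqs in nodes.values())
    return busiest * record_bytes / disk_bytes_per_sec


def replay_in_order(nodes):
    """Merge the per-node streams by sequence number to recover the total order."""
    return list(heapq.merge(*nodes.values()))


# 1000 records of 1 MB each, disks that read 100 MB/s (made-up numbers).
one_disk = backfill_seconds(1000, 1_000_000, 1, 100_000_000)
four_disks = backfill_seconds(1000, 1_000_000, 4, 100_000_000)
ordered = replay_in_order(place_records(10, 4))
```

In this model the single-disk case is bounded by one disk's bandwidth, while the four-node case reads a quarter of the records from each disk. The merged stream comes back in sequence order because the order was fixed when the sequence numbers were assigned, not by where the records were stored.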