by ChrisDan121
354 days ago
Curious to hear how others have handled scale challenges in billing infrastructure. If you're running usage-based billing for AI, infra, or API-heavy platforms, how do you deal with high-throughput event ingestion (say, 10k+ events/sec) without dropping events or messing up customer metering?

We’ve seen setups struggle hard with:
- Event ordering guarantees
- Idempotency at scale
- Handling retries without double-counting

Would love to hear what infra patterns, queues, or storage choices worked (or failed) for you.

Our approach focuses on:
- Fire-and-forget ingestion with in-memory queues so events don’t block product requests
- Strict idempotency tokens tied to every event, enforced at the API layer
- Lightweight retry logic that prevents double-counting but guarantees delivery under transient failures
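
To make those three points concrete, here is a rough Python sketch, not our production code. `Ingestor`, `FlakyStore`, and every parameter name are made up for illustration. It assumes a single process, an in-memory set for idempotency keys, and a sink that upserts by key so a replayed write cannot double-count. A real deployment would likely need a durable log behind the queue, since anything held only in memory is lost on a crash.

```python
import queue
import threading
import time


class FlakyStore:
    """Hypothetical metering store: applies the write, then 'loses the ack' a few times."""

    def __init__(self, failures=2):
        self.failures = failures
        self.rows = {}

    def write(self, key, event):
        self.rows[key] = event  # upsert by idempotency key: replays overwrite, never add
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("transient failure after write")


class Ingestor:
    """Hypothetical ingestor: dedupe on submit, deliver on a background thread."""

    def __init__(self, sink, max_attempts=5, base_delay=0.01):
        self._sink = sink  # callable(key, event); may raise on transient failure
        self._max_attempts = max_attempts
        self._base_delay = base_delay
        self._queue = queue.Queue()
        self._seen = set()  # idempotency keys already accepted
        self._lock = threading.Lock()
        self.dead_letters = []  # events that exhausted their retries
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, key, event):
        """Accept an event without waiting on the sink. Returns False for a duplicate key."""
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
        self._queue.put((key, event))
        return True

    def _run(self):
        while True:
            key, event = self._queue.get()
            try:
                self._deliver(key, event)
            finally:
                self._queue.task_done()

    def _deliver(self, key, event):
        for attempt in range(self._max_attempts):
            try:
                self._sink(key, event)
                return
            except Exception:
                time.sleep(self._base_delay * 2 ** attempt)  # exponential backoff
        self.dead_letters.append((key, event))

    def flush(self):
        """Block until everything queued so far has been delivered or dead-lettered."""
        self._queue.join()


store = FlakyStore(failures=2)
ing = Ingestor(store.write)
first = ing.submit("evt-1", {"customer": "acme", "units": 3})
duplicate = ing.submit("evt-1", {"customer": "acme", "units": 3})  # client-side retry
ing.submit("evt-2", {"customer": "acme", "units": 4})
ing.flush()
total = sum(e["units"] for e in store.rows.values())
```

The duplicate `evt-1` is rejected at `submit`, and the two failed deliveries are replayed against the same key, so the store ends up with one row per event. The design choice worth noting is that dedupe happens twice: once at the API edge and once in the sink, because a retry after a lost ack looks identical to a first attempt from the sender's side.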
Storage-wise, we’ve leaned on a mix of time-series DBs for raw events and pre-aggregated summaries for billing views.
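
For anyone who wants to picture the raw-plus-summary split, here is a small sketch that uses SQLite from the Python standard library as a stand-in for a time-series DB. The table names, columns, and the `record` helper are all hypothetical, and hourly buckets are just an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_events (
    event_id TEXT PRIMARY KEY,      -- idempotency key
    customer TEXT NOT NULL,
    ts       INTEGER NOT NULL,      -- unix seconds
    units    INTEGER NOT NULL
);
CREATE TABLE usage_hourly (
    customer   TEXT NOT NULL,
    hour_start INTEGER NOT NULL,
    units      INTEGER NOT NULL,
    PRIMARY KEY (customer, hour_start)
);
""")


def record(conn, event_id, customer, ts, units):
    """Store the raw event and bump its hourly bucket in one transaction.

    Returns False (and changes nothing) if event_id was already recorded.
    """
    hour_start = ts - ts % 3600
    with conn:  # commits on success, rolls back if anything raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO raw_events VALUES (?, ?, ?, ?)",
            (event_id, customer, ts, units),
        )
        if cur.rowcount == 0:
            return False  # duplicate: leave the summary alone
        cur = conn.execute(
            "UPDATE usage_hourly SET units = units + ? "
            "WHERE customer = ? AND hour_start = ?",
            (units, customer, hour_start),
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO usage_hourly VALUES (?, ?, ?)",
                (customer, hour_start, units),
            )
    return True


record(conn, "evt-1", "acme", 7200, 3)
record(conn, "evt-2", "acme", 7260, 4)
replayed = record(conn, "evt-1", "acme", 7200, 3)  # replay of evt-1
record(conn, "evt-3", "acme", 10800, 5)

summary = conn.execute(
    "SELECT hour_start, units FROM usage_hourly "
    "WHERE customer = 'acme' ORDER BY hour_start"
).fetchall()
raw_total = conn.execute("SELECT SUM(units) FROM raw_events").fetchone()[0]
```

Keeping the raw rows around means the summary can be rebuilt or reconciled later: here `raw_total` and the sum over `summary` should always agree, and a mismatch would be a signal that an aggregate update was skipped or applied twice.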
Would love to swap notes on failure patterns or queue setups if you’ve dealt with similar scale.