by solatic 252 days ago

  Geocodio offers a pay-as-you-go metered plan where users get 2,500 free geocoding lookups per day. This means we need to:
  Track the 2,500 free tier requests
  Continue tracking above that threshold for billing
  Let users view their usage in real-time on their dashboard
  Give admins the ability to query this data for support and debugging
  Store request details so we can replay customer requests when debugging issues
Just on the basis of what you wrote here, I'm not convinced ClickHouse is the right tool. ClickHouse would very much help you crunch statistics for latencies etc., but just for billing and getting individual query data?

1) Push the request to Kafka/Pub/Sub/etc.

2) One consumer pushes to TigerBeetle to track request usage within the free tier and other billing.

3) One consumer pushes individual requests to object storage, which scales out infinitely-ish and lets you get full request details for an individual request; lifecycle rules will automatically async delete old requests for you.

If request statistics are important for business analysis, then instead of (boring) object storage you could look at one of the newer Iceberg-based options on top of object storage, e.g. S3 Tables, as long as querying an individual request remains fast and statistics can be generated, say, for a nightly report.

Another cheap approach: hook up another consumer to the Pub/Sub topic, and for any request whose latency is above a reasonable threshold, dump it into a Slack channel with a reference to the request ID so someone can look into debugging it.
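
To make the fan-out concrete, here is a rough Python sketch of one event going to three independent consumers (usage ledger, request archive, slow-request alert). Everything in it is a stand-in I made up for illustration: the classes are in-memory placeholders, not the Kafka, TigerBeetle, S3 or Slack APIs, and the 2,000 ms latency threshold is an arbitrary assumption. Only the 2,500/day free tier comes from the quoted article.

```python
from collections import defaultdict
from dataclasses import dataclass

FREE_TIER_PER_DAY = 2500   # from the quoted plan description
LATENCY_ALERT_MS = 2000    # assumed threshold, not from the source


@dataclass(frozen=True)
class RequestEvent:
    """Hypothetical event shape published once per geocoding request."""
    request_id: str
    customer_id: str
    day: str           # e.g. "2025-01-31"
    latency_ms: float
    payload: str       # raw request details, kept for replay


class UsageConsumer:
    """In-memory stand-in for the ledger consumer (TigerBeetle in the comment)."""

    def __init__(self):
        self.counts = defaultdict(int)

    def handle(self, ev):
        self.counts[(ev.customer_id, ev.day)] += 1

    def billable(self, customer_id, day):
        # Requests beyond the daily free tier are the ones that get billed.
        used = self.counts.get((customer_id, day), 0)
        return max(0, used - FREE_TIER_PER_DAY)


class ArchiveConsumer:
    """In-memory stand-in for object storage: one object per request ID."""

    def __init__(self):
        self.objects = {}

    def handle(self, ev):
        self.objects[f"requests/{ev.day}/{ev.request_id}"] = ev.payload


class SlowRequestConsumer:
    """Collects IDs of slow requests; a real one might post them to chat."""

    def __init__(self):
        self.alerts = []

    def handle(self, ev):
        if ev.latency_ms > LATENCY_ALERT_MS:
            self.alerts.append(ev.request_id)


def publish(ev, consumers):
    """Stand-in for the queue: every consumer sees every event."""
    for consumer in consumers:
        consumer.handle(ev)
```

The point of the shape is that each consumer only knows about its own concern, so the billing counter, the replayable archive and the alerting can each be swapped out or scaled without touching the others.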