by kn7 2759 days ago
We store the real-time content stream in a separate bulk storage system (e.g., BigQuery) with a certain retention window, but the ETL'ed documents always live on ES. Given that a plain event (i.e., not an ETL'ed document) is of little value for search, I would not call the stream storage the primary storage. It just helps us rebuild the ETL state in case of an emergency.
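
To make the "rebuild the ETL state" part concrete, here is a rough sketch of what such a replay could look like. It is not our actual pipeline: the `Event` type, the `rebuild_documents` function, and the "merge fields per document" ETL step are all made up for illustration, and plain in-memory dicts stand in for both the stream storage and ES.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Event:
    """Hypothetical raw event as it might sit in the stream storage."""
    doc_id: str
    timestamp: int          # event time, e.g. epoch seconds
    fields: Dict[str, str]  # partial update carried by this event


def rebuild_documents(
    events: Iterable[Event], retention_start: int
) -> Dict[str, Dict[str, str]]:
    """Replay raw events in time order and fold them into per-document state.

    Assumption: the ETL step is a simple field merge where later events
    overwrite earlier ones. Events older than retention_start are skipped
    to mimic a retention window that has already dropped them.
    """
    documents: Dict[str, Dict[str, str]] = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.timestamp < retention_start:
            continue  # outside the retention window
        documents.setdefault(event.doc_id, {}).update(event.fields)
    return documents
```

Note that a document whose events all fall outside the retention window cannot be rebuilt this way, which is one more reason to treat the stream storage as a recovery aid and not as the primary store.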