by ironchef
1032 days ago
Here was my situation: occasional queries over a couple of petabytes of data. It’s customer-facing, so the SLA requires responses in seconds, but >95 percent of the time the warehouse isn’t running. Queries cached within the last 24 hours don’t require the warehouse to even spin up. Our Snowflake costs were significantly less than an FTE. Would that potentially be a situation where “running your own” doesn’t make sense?
|
Look into data lake architectures. RDBMS-based data warehousing is obviously not economical at petabyte scale. But storing all that data in S3 in Delta Lake or Iceberg format and querying it with Spark changes things entirely. You only pay for object storage, and S3 read costs are trivial.
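
To make that comparison concrete, here is a rough back-of-the-envelope sketch. The function name, its parameters, and every number in the example call are hypothetical placeholders, not real prices. The compute term is my own assumption rather than something from the comment above: a Spark cluster generally costs money while a query is running, so it seems worth putting next to the storage bill.

```python
def monthly_lake_cost(stored_tb, storage_price_per_tb_month,
                      queries_per_month, compute_hours_per_query,
                      compute_price_per_hour):
    """Return (storage, compute, total) monthly cost.

    All prices are caller-supplied; results are in whatever currency
    unit the prices use. This is a rough model, not a billing calculator.
    """
    # Object storage is billed on data at rest, whether or not anyone queries it.
    storage = stored_tb * storage_price_per_tb_month
    # Assumption: compute is billed only while a query is actually running.
    compute = queries_per_month * compute_hours_per_query * compute_price_per_hour
    return storage, compute, storage + compute


# Placeholder inputs only; substitute your own volumes and negotiated rates.
storage, compute, total = monthly_lake_cost(
    stored_tb=2000,
    storage_price_per_tb_month=10.0,
    queries_per_month=300,
    compute_hours_per_query=0.5,
    compute_price_per_hour=10.0,
)
print(f"storage={storage:.0f} compute={compute:.0f} total={total:.0f}")
```

Plugging in your actual Snowflake bill alongside the total gives a like-for-like number. With occasional queries, the storage term will probably dominate in both setups.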