by ergest
2623 days ago
They built a pipeline that complicated for 100 GB? That's insanely over-engineered! Very typical of engineers who just want to pad their resumes at the expense of unsuspecting business people. I've worked with single-server data warehouses on SQL Server that were 10x that size and served the entire company.

I don't know what your data looks like, whether it's just transactional or a combination of transactional data and raw server/app logs. You could ETL the raw logs into an RDBMS like Postgres, but then you have to maintain it, and it doesn't sound like you have enough resources for that. To do it, you need help from IT/ops to set up a replica of the live server so it can be queried without disrupting transactional operations, and then either write the ETL code yourself or use a service like Stitch or Panoply.

You can also use a cloud platform like Google BigQuery or AWS Redshift to dump the raw data in and then create views and table extracts for all the commonly used business functions. That's still overkill though, and a simple RDBMS should suffice.

And if you want to raise awareness, see this article by Stitch Fix and the HN comments: https://news.ycombinator.com/item?id=11312243
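To make the "dump raw data in, then create views" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the log format, the `raw_events` table, the `daily_revenue` view, and the use of SQLite from Python's standard library as a stand-in for whatever RDBMS you actually pick (Postgres, SQL Server, etc.).

```python
import sqlite3

# Hypothetical raw log lines; a real job would read these from files or a replica.
RAW_LOGS = [
    "2016-03-01 alice purchase 30.00",
    "2016-03-01 bob purchase 12.50",
    "2016-03-02 alice refund -30.00",
    "2016-03-02 carol purchase 99.99",
]


def load_logs(conn, lines):
    """Parse space-delimited log lines and load them into a raw_events table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events ("
        "day TEXT, username TEXT, event TEXT, amount REAL)"
    )
    rows = []
    for line in lines:
        day, username, event, amount = line.split()
        rows.append((day, username, event, float(amount)))
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def create_daily_revenue_view(conn):
    """Define a view that business users can query instead of the raw table."""
    conn.execute(
        "CREATE VIEW IF NOT EXISTS daily_revenue AS "
        "SELECT day, SUM(amount) AS revenue, COUNT(*) AS events "
        "FROM raw_events GROUP BY day"
    )


def daily_revenue(conn):
    """Return (day, revenue, events) rows from the view, ordered by day."""
    return conn.execute(
        "SELECT day, revenue, events FROM daily_revenue ORDER BY day"
    ).fetchall()


# In-memory database so the sketch runs without any server setup.
conn = sqlite3.connect(":memory:")
load_logs(conn, RAW_LOGS)
create_daily_revenue_view(conn)
```

Calling `daily_revenue(conn)` then gives one row per day. The point is only that at this data size, a load step plus a handful of views on a single database covers most reporting needs.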
Or they were given the same PR crap you always get from salespeople, that they're just days away from tripling the number of clients and by next year they should be at 10-20x the number, so they went ahead and "built it right" so they wouldn't run into the inevitable scaling issues they were assured they'd hit in short order?