by wildengineer
2615 days ago
It sounds like your org needs two things: 1. a data warehouse for this data, and 2. awareness of software/data best practices. That being said, while I agree code duplication is bad, data duplication isn't, as long as you are maintaining data lineage. In some cases data duplication is good. I also wouldn't care too much that you have 100 GB max in a big data architecture. So what? It's not like you're going to be able to get rid of it easily. A data warehouse built from a new set of pipelines seems like the biggest bang for your buck.