I'm excited for LLM applications that can set up, monitor/validate, and optimize data pipelines at scale. Seems possible soon, given that SQL and most data records aren't intended to be human-friendly.
When LLMs can do the following, they might be able to fix data hell:
- Negotiate with different teams to figure out what a field means
- Be told that a field should be converted from one format to another, but oh wait, the conversion is causing errors somewhere downstream because the instructions were wrong
- Take an issue that people bring about the code you maintain, and dig enough to realize the root cause is another team's code