Hacker News
by icedchai 492 days ago
You have more faith in LLMs than I do. The reality is it will probably get you 70 to 80% there, then you'll spend a ton of time debugging / fixing your pipelines, only to realize it would've been simpler, faster, and more reliable to not involve an LLM in the first place.

I believe that we'll learn how to incorporate LLMs to improve parts of data pipelines, particularly those that involve extracting unstructured or semi-structured data into structured data, especially if the model can provide a reliability score or confidence level with each extraction. I'm much more skeptical of claims beyond that.
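
To make the confidence-score idea concrete, here is a minimal sketch, not anyone's actual pipeline. The `Extraction` type, the `route` function, the 0.9 threshold, and the sample invoice records are all made up for illustration, and it assumes the extraction step hands back a score in [0, 1] that actually means something, which is a big assumption.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    record: dict       # structured fields pulled out of the raw text
    confidence: float  # hypothetical score in [0, 1] reported with the extraction


def route(extractions, threshold=0.9):
    """Split extractions into auto-accepted and needs-human-review lists."""
    accepted, review = [], []
    for ex in extractions:
        # Anything below the threshold goes to a person instead of the warehouse.
        (accepted if ex.confidence >= threshold else review).append(ex)
    return accepted, review


# Illustrative data only; a real batch would come from the extraction step.
batch = [
    Extraction({"invoice_id": "A-1", "total": 120.0}, 0.97),
    Extraction({"invoice_id": "A-2", "total": None}, 0.41),
]
accepted, review = route(batch)
```

The point of the design is that the LLM never writes to the structured store unchecked: low-confidence rows land in a review queue, so the failure mode is extra manual work rather than silently bad data.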

I also think there are unanswered questions about reliability, cost (dollar and energy), and AI business models; I don't think OpenAI can burn $2+ to make a dollar forever.

Unless you can provide some citation, I don't think you are right. I do this every day now, and it gets me 99% there with very little debugging.
As always, "it depends." How simple are your pipelines? Single CSV? Sensible column names that are totally unambiguous? Consistent, clean data? Then LLMs are probably fine...