

by cccybernetic 1019 days ago

A blog post on using LLMs to clean, process, and enrich data. It includes prompts and code snippets. The post draws on my own experience and two really interesting papers:

- Can Foundation Models Wrangle Your Data? (https://arxiv.org/abs/2205.09911)

- Large Language Models as Data Preprocessors (https://arxiv.org/abs/2308.16361)

I cover:

- Error and Anomaly Detection

- Enriching Data with LLMs

- Matching Data Labels

- Identifying Matching Records

Thank you, and I'd appreciate your feedback.