Hacker News
by pitah1 965 days ago
Thanks for the response. I also noticed there was a mention of data contracts or Pydantic to keep your data clean. Would it make sense to embed that as part of a DLT pipeline or is the recommendation to include it as part of the transformation step?
1 comment

You can use Pydantic models to define schemas and validate data (we also load instances of the models natively): https://dlthub.com/docs/general-usage/resource#define-a-sche...

We have a PR (https://github.com/dlt-hub/dlt/pull/594) that is about to merge and makes the above highly configurable, anywhere between schema evolution and a hard stop:
- you will be able to totally freeze the schema and reject bad rows
- or accept the data for existing columns but not for new columns
- or accept some fields based on rules
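
To make the behaviors listed above concrete, here is a rough plain-Python sketch. It is not dlt's API (the PR defines that). The helper `apply_contract`, the mode names `freeze`, `discard_new`, `rules` and `evolve`, and the `allow_new` callback are all made up for illustration, and "bad row" is assumed here to mean only "row with a column the schema does not know".

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Set, Tuple

Row = Dict[str, Any]
MODES = {"evolve", "freeze", "discard_new", "rules"}  # hypothetical names


def apply_contract(
    rows: Iterable[Row],
    known_columns: Set[str],
    mode: str,
    allow_new: Optional[Callable[[str], bool]] = None,
) -> Tuple[List[Row], List[Row], Set[str]]:
    """Return (accepted rows, rejected rows, resulting column set)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    columns = set(known_columns)  # copy so the caller's set is untouched
    accepted: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        new_cols = set(row) - columns
        if not new_cols or mode == "evolve":
            # Nothing unknown, or the schema is allowed to grow freely.
            columns |= new_cols
            accepted.append(dict(row))
        elif mode == "freeze":
            # Frozen schema: a row with unknown columns is rejected whole.
            rejected.append(dict(row))
        elif mode == "discard_new":
            # Keep the data for existing columns, drop the new ones.
            accepted.append({k: v for k, v in row.items() if k in columns})
        else:  # mode == "rules"
            # Only new columns that pass the rule are added to the schema.
            columns |= {c for c in new_cols if allow_new and allow_new(c)}
            accepted.append({k: v for k, v in row.items() if k in columns})
    return accepted, rejected, columns


rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b", "extra": "x"},
    {"id": 3, "name": "c", "meta_source": "api", "junk": 0},
]
```

For example, `apply_contract(rows, {"id", "name"}, "rules", allow_new=lambda c: c.startswith("meta_"))` would let `meta_source` into the schema while dropping `extra` and `junk`, whereas `"freeze"` would reject the second and third rows outright.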