by sitkack 1044 days ago
> easy by replacing brittle SQL scripts with object-oriented Python

There is a lot to unpack here. Can you explain this in more detail?

1 comment

Sure. Our approach is to define Python classes that handle reusable steps for reading, transforming, or loading data. For example, we have MapTransformer, CastColumnsTransformer, and GeoDistanceTransformer.

Each class declares the configuration its "step" needs, and the config file then references these classes to assemble a full ETL job. You can also write unit tests for any custom transformers you create, as we show in the tests/ directory.
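
To make the pattern concrete, here is a minimal sketch of config-driven steps. It is not the code from the repo: the class names MapTransformer and CastColumnsTransformer come from the comment above, but their fields (mapping, casts), the Transformer base class, REGISTRY, build_pipeline, run, and the "type"/"params" config keys are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Row = Dict[str, Any]


class Transformer:
    """Hypothetical base class: one reusable ETL step."""

    def transform(self, rows: List[Row]) -> List[Row]:
        raise NotImplementedError


@dataclass
class MapTransformer(Transformer):
    # Assumed config: old column name -> new column name.
    mapping: Dict[str, str]

    def transform(self, rows: List[Row]) -> List[Row]:
        # Rename mapped columns, leave all others untouched.
        return [{self.mapping.get(k, k): v for k, v in row.items()} for row in rows]


@dataclass
class CastColumnsTransformer(Transformer):
    # Assumed config: column name -> name of the target type.
    casts: Dict[str, str]

    # Plain class attribute (no annotation), so it is not a dataclass field.
    _TYPES = {"int": int, "float": float, "str": str}

    def transform(self, rows: List[Row]) -> List[Row]:
        # Convert only the columns listed in `casts`.
        return [
            {k: self._TYPES[self.casts[k]](v) if k in self.casts else v
             for k, v in row.items()}
            for row in rows
        ]


# Hypothetical registry linking names used in the config to Python classes.
REGISTRY = {
    "MapTransformer": MapTransformer,
    "CastColumnsTransformer": CastColumnsTransformer,
}


def build_pipeline(config: List[Dict[str, Any]]) -> List[Transformer]:
    """Instantiate one step object per config entry."""
    return [REGISTRY[step["type"]](**step.get("params", {})) for step in config]


def run(steps: List[Transformer], rows: List[Row]) -> List[Row]:
    """Apply each step in order to the rows."""
    for step in steps:
        rows = step.transform(rows)
    return rows


# Stand-in for a parsed config file (the real format may differ).
CONFIG = [
    {"type": "MapTransformer", "params": {"mapping": {"qty": "quantity"}}},
    {"type": "CastColumnsTransformer", "params": {"casts": {"quantity": "int"}}},
]


def test_pipeline() -> None:
    # Steps are plain objects, so they can be unit tested without a database.
    result = run(build_pipeline(CONFIG), [{"qty": "3", "sku": "a1"}])
    assert result == [{"quantity": 3, "sku": "a1"}]
```

Because each step is an ordinary class with explicit configuration, a test can exercise one transformer or a whole pipeline on a few in-memory rows, which is hard to do with a monolithic SQL script.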

I have also updated the README in our repo, which should now better explain how the config file maps to specific Python objects.