Hacker News
by prepend 2566 days ago
I really like how they implemented the data catalog [0]: it's YAML-based and uses a path-style cascade of files, so catalog entries can be shared across or within teams, or kept personal for individual projects. I think this makes it easy to build tooling on top for meta-analysis (how many data sets are used, etc.) and even visualization, using a variety of tools rather than tying metadata management to one system or product.
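
To make the cascading idea concrete, here is a rough sketch in plain Python. It is not Kedro's actual loader or API: the layer names (`COMMON`, `TEAM`, `PERSONAL`), the entry fields, and the helpers `merge_catalogs` and `count_by_type` are all made up for illustration, and the dicts stand in for whatever you would get from parsing the YAML files. I'm also assuming that a later layer replaces a whole entry of the same name, which may not match how any particular tool resolves conflicts.

```python
from collections import Counter

# Hypothetical catalog layers, standing in for parsed YAML files.
# Assumed precedence: common < team < personal (later layers win).
COMMON = {
    "customers": {"type": "csv", "filepath": "data/common/customers.csv"},
    "orders": {"type": "csv", "filepath": "data/common/orders.csv"},
}
TEAM = {
    # Same name as a common entry, so it overrides it.
    "orders": {"type": "parquet", "filepath": "data/team/orders.parquet"},
}
PERSONAL = {
    "scratch": {"type": "csv", "filepath": "data/me/scratch.csv"},
}


def merge_catalogs(*layers):
    """Merge catalog layers in order; a later layer replaces a whole entry."""
    merged = {}
    for layer in layers:
        for name, entry in layer.items():
            merged[name] = dict(entry)  # copy so the layers stay untouched
    return merged


def count_by_type(catalog):
    """Simple meta-analysis: how many data sets there are of each type."""
    return Counter(entry["type"] for entry in catalog.values())


catalog = merge_catalogs(COMMON, TEAM, PERSONAL)
type_counts = count_by_type(catalog)
```

Because the catalog is just files and plain data, the counting step needs nothing beyond the standard library, which is the point about not being tied to a particular product.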

Are there other techniques for data catalogs that are file-based, or at least based on open standards, and that scale all the way up from a single developer?

[0] https://kedro.readthedocs.io/en/latest/04_user_guide/04_data...

1 comment

There's the intake project from the Anaconda folks.