by kite_and_code 2515 days ago
Thank you for pointing out TFDV (TensorFlow Data Validation). I had not come across it before.

And yes, as I say in the video, we used the Trifacta Wrangler Free Version to illustrate the vision of what we aspire to build. In the end, it will of course look different, and we have some ideas for a completely different user interface. Whether that will be better or worse remains to be seen.

And thank you for the comparison of Trifacta and pandas. I agree that pandas can't handle arbitrarily large datasets. However, I wonder if the manageable dataset size can be increased if we also work in the cloud on machines with more RAM. Or maybe we could even export Dask code instead of pandas code.
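
To make the Dask idea a bit more concrete, here is a rough sketch of how the same recorded wrangling steps could be exported as either pandas or Dask code. Everything in it (the step format, `render_step`, `export_code`, the file name) is hypothetical and only for illustration; it is not how our tool or Trifacta works.

```python
# Hypothetical sketch: render recorded wrangling steps as pandas or Dask code.
# The step format and all function names are made up for illustration.

HEADERS = {
    "pandas": "import pandas as pd\ndf = pd.read_csv({path!r})",
    "dask": "import dask.dataframe as dd\ndf = dd.read_csv({path!r})",
}


def render_step(step):
    """Turn one recorded step into a line of DataFrame code."""
    kind = step["op"]
    if kind == "dropna":
        return f"df = df.dropna(subset={step['columns']!r})"
    if kind == "rename":
        return f"df = df.rename(columns={step['mapping']!r})"
    raise ValueError(f"unsupported step: {kind}")


def export_code(path, steps, backend="pandas"):
    """Return the generated script as a string for the chosen backend."""
    if backend not in HEADERS:
        raise ValueError(f"unknown backend: {backend}")
    lines = [HEADERS[backend].format(path=path)]
    lines += [render_step(step) for step in steps]
    if backend == "dask":
        # Dask is lazy: compute() runs the steps and returns a pandas
        # DataFrame, so the final result still has to fit in memory.
        lines.append("df = df.compute()")
    return "\n".join(lines)


steps = [
    {"op": "dropna", "columns": ["price"]},
    {"op": "rename", "mapping": {"price": "price_usd"}},
]
print(export_code("data.csv", steps, backend="dask"))
```

Since the two libraries share most of the DataFrame API for simple steps like these, only the import, the reader, and the final `compute()` differ between the two outputs.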

So, you seem to have experience working with Trifacta Wrangler. Is there something that you don't love about their solution?