by kortex
3006 days ago
I spent half a day playing around with something very similar to this. I wanted a concise language for describing data pipelines in Pandas, and was (ab)using Python dunder methods (operator overloading) to this end. For example, `data | groupby, "author" | mean` would create a graph object, which could be lazily evaluated, run in Dask, TF, etc. It started to get ugly when passing multiple parameters into a function: I had to watch out for left and right associativity, and manage casting of arguments. It was a fun little experiment, but I'm not sure how much it would actually improve workflows. If that sounds interesting, let me know and I'll poke at it again.
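For anyone curious what the dunder trick looks like, here is a rough sketch, not the original experiment. It assumes plain lists of dicts instead of Pandas, and the names `Step`, `Pipeline`, `run` and the `"score"` column are all made up for illustration. Note that in real Python the comma binds looser than `|`, so `data | groupby, "author" | mean` parses as a tuple of two expressions; the sketch sidesteps that by binding parameters with a call, as in `groupby("author", "score")`.

```python
from collections import defaultdict


class Pipeline:
    """An ordered list of steps plus an optional data source; nothing runs until run()."""

    def __init__(self, steps, source=None):
        self.steps = list(steps)
        self.source = source

    def __or__(self, step):
        # pipeline | step -> a longer pipeline (left-associative chaining)
        return Pipeline(self.steps + [step], self.source)

    def __ror__(self, data):
        # data | pipeline -> the same pipeline with a source attached
        return Pipeline(self.steps, source=data)

    def run(self, data=None):
        # Evaluate the steps in order, feeding each result into the next step.
        value = self.source if data is None else data
        for step in self.steps:
            value = step.apply(value)
        return value


class Step:
    """A deferred function call; `|` chains steps instead of executing them."""

    def __init__(self, func, *args, **kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs

    def __call__(self, *args, **kwargs):
        # Bind extra parameters up front: groupby("author", "score") is a new Step.
        return Step(self.func, *args, **kwargs)

    def __or__(self, other):
        # step | step -> pipeline without a source yet
        return Pipeline([self, other])

    def __ror__(self, data):
        # data | step: used when the left operand does not know how to `|` a Step
        return Pipeline([self], source=data)

    def apply(self, value):
        return self.func(value, *self.args, **self.kwargs)


def _groupby(rows, key, value):
    # Collect the `value` field of each row under its `key` field.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return dict(groups)


def _mean(groups):
    return {k: sum(v) / len(v) for k, v in groups.items()}


groupby = Step(_groupby)
mean = Step(_mean)

rows = [
    {"author": "ann", "score": 4},
    {"author": "bob", "score": 1},
    {"author": "ann", "score": 2},
]

pipeline = rows | groupby("author", "score") | mean  # builds a graph, runs nothing
result = pipeline.run()                               # evaluation happens here
```

Because the pipeline is just a list of steps, a different `run` could in principle hand the same graph to another backend. The `__ror__` hook only fires when the left operand's own `__or__` does not handle a `Step`, which is one place the associativity and casting headaches come from.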
1) Coconut: http://coconut-lang.org/
2) https://github.com/JulienPalard/Pipe
3) Pandas also has a new DataFrame `pipe` method: https://pandas.pydata.org/pandas-docs/stable/generated/panda...
I would look at those before rolling out a custom solution.