|
|
|
|
|
by Chirono
740 days ago
|
|
I used to use K professionally inside a hedge fund a few years back. Aside from the terrible user experience (if your code isn’t correct you will often just get ‘error’ or ‘not implemented’ with no further detail), if the performance really was as stellar as claimed, then there wouldn’t need to be a no benchmark clause in the license.
It can be fast, if your data is in the right formats, but not crazy fast. And easy to beat if you can run your code on the GPU. |
|
More recently, we have been working on GFQL with users at places like banks (graph dataframe query language), where we translate down to tools like pandas & cudf. A big "aha" is that columnar operations are great -- not far from what array languages focus on -- and having a static/dynamic query planner so optimizations around that helps once you hit memory limits. Eg, dask has dynamic DFS reuse of partitions as part of its work stealing. More SQL-y tools like Spark may make plans like that ahead of time. In contrast, that lands more on the user if they stick with pandas or k, eg, manual tiling.