by ZeroCool2u
326 days ago
Agreed, the value is nonsense. This is what we use: https://domino.ai/ The marketing on the website is a bit intense, but the docs are pretty good: https://docs.dominodatalab.com/en/cloud/user_guide/71a047/wh... They definitely target large companies, but you can use their SaaS offering and it can be relatively affordable. The best part is the flexibility and scaling, but the license model is awesome too. There's no usage-based billing: you pay a flat license fee per user who writes code, plus the underlying cloud costs, and they'll deploy it on GCP, AWS, or Azure. They're used by a lot of large companies, but also by academia to replace or augment on-prem HPC clusters. That's what we used them for as well.
|
I'm interested in your opinion as a user on a bit of a new conundrum for me: in as many jobs / contracts as I can remember, the data science was central enough that we were building it ourselves from, like, the object store up.
But in my current role, I'm managing a whole different kind of infrastructure that pulls in very different directions. The people who need to interact with data range from full-time quants to people with very little programming experience, so I'm kinda peeking around for an all-in-one solution. Log the rows here, connect the notebook here, right this way to your comprehensive dashboards and graphs with great defaults.
Is this what I should be looking at? The code that needs to run on the data is your standard statistical and numerics Python-type stuff (and if R were available it would probably get used, but I don't need it): I need a dataframe of all the foo from date to date and I want to run a regression and maybe set up a little Monte Carlo thing. Hey, that one is really useful, let's make it compute that every night and put it on the wall.
I think we'd pay a lot for an answer here and I really don't want to like, break out pyarrow and start setting up tables.
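To make the kind of job I mean concrete, here's a rough sketch: pull the rows between two dates, fit a regression, then run a small Monte Carlo off the fit. Everything in it is made up for illustration (the synthetic `rows`, the `day`/`x`/`y` field names, and the helper functions), and it uses only the Python standard library as a stand-in for what would normally be pandas/numpy on a real table.

```python
import random
import statistics
from datetime import date, timedelta


def rows_between(rows, start, end):
    """Keep rows whose 'day' falls in the inclusive range [start, end]."""
    return [r for r in rows if start <= r["day"] <= end]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def monte_carlo(slope, intercept, resid_sd, x, n_draws, seed=0):
    """Simulate outcomes at x by adding Gaussian noise to the fitted line."""
    rng = random.Random(seed)
    return [intercept + slope * x + rng.gauss(0.0, resid_sd)
            for _ in range(n_draws)]


# Hypothetical data: 90 daily rows following y = 2x + 1 plus noise.
data_rng = random.Random(42)
first_day = date(2024, 1, 1)
rows = [
    {"day": first_day + timedelta(days=i),
     "x": float(i),
     "y": 2.0 * i + 1.0 + data_rng.gauss(0.0, 0.5)}
    for i in range(90)
]

# "All the foo from date to date".
window = rows_between(rows, date(2024, 1, 15), date(2024, 2, 15))
xs = [r["x"] for r in window]
ys = [r["y"] for r in window]

# "Run a regression".
slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
resid_sd = statistics.stdev(residuals)

# "A little Monte Carlo thing": simulated outcomes at x = 100.
draws = monte_carlo(slope, intercept, resid_sd, x=100.0, n_draws=10_000)
print(f"rows={len(window)} slope={slope:.3f} intercept={intercept:.3f}")
print(f"simulated mean at x=100: {statistics.fmean(draws):.2f}")
```

The "every night, on the wall" part would just be this same script run by whatever scheduler the platform provides, with the printed numbers going to a dashboard instead of stdout.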