Very cool idea. I can't wait to play with this. I've sold similar "productized services" as a consultant. Most businesses benefit from churn, revenue/expense forecasting, and stack-rank models, but not to the degree where paying for in-house staff makes sense. I was typically brought in as part of the fund-raising process.
Thanks for the insight! We expect to keep the Google Sheets add-on free for the near future. For our business model, we are instead focusing on building fully end-to-end pipelines that use the same ML engine as the Google Sheets add-on:
Data integration -> ML Predictions -> taking actions.
As an example, for an ecommerce store we plug into their Shopify, predict people likely to churn and send them a Mailchimp discount code.
We are running pilot projects now for these pipelines, so if you are interested do not hesitate to reach out to me on jan [at] magicsheets [dot] io
I have mixed feelings. On one hand, this is pretty awesome in the sense that more people should have easier access to different kinds of ML. But on the other hand, I’m guessing the people who can’t pipe a csv to pandas + scikit will probably neglect important steps like data processing and output interpretation, such as balanced accuracy vs. accuracy.
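To make the balanced accuracy vs. accuracy point concrete, here is a minimal sketch in plain Python. The churn labels are made up for illustration, and the two helper functions are hypothetical names, not part of any library or of the add-on. The "model" simply predicts that nobody churns.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Made-up labels: 95 customers stay (0), 5 churn (1).
y_true = [0] * 95 + [1] * 5
# A useless model that always predicts "stays".
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f} balanced_accuracy={bal_acc:.2f}")
```

Plain accuracy looks great here because the majority class dominates, while balanced accuracy shows the model is no better than a coin flip at separating the two classes.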
I like to say "the dollars are in the data". Tools like these do open up ML to a wider audience, but they do nothing to help that audience understand that nothing beats proper record keeping and data collection. Furthermore, they handicap you in certain ways. For example, how are missing values handled?
Totally agree, getting actionable and confident results from ML is a really tough thing to do. Imputing data, feature processing, metric choices, etc. are almost an art form (though new AutoML libraries have made this much easier).
Totally understand your mixed feelings. We are trying to do as many standard data preprocessing steps as we can (cleaning, normalizing, one-hot encoding, etc.) and are now switching over to an AutoML engine. We basically see the outputs of Magicsheets as great baseline models. Data scientists could definitely do better (especially with domain knowledge), but we should at least be able to give some useful predictions back for most problems :).
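For readers unfamiliar with the preprocessing steps named above, here is a rough sketch of what cleaning (here: mean imputation of missing values), normalizing, and one-hot encoding can look like in plain Python. This is only an illustration under assumed choices; the column names and helper functions are hypothetical, and it does not describe how Magicsheets actually implements these steps.

```python
def impute_mean(values):
    """Replace None with the mean of the observed values (one simple cleaning choice)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical spreadsheet columns; None marks an empty cell.
order_value = [20.0, None, 60.0, 100.0]
plan = ["basic", "pro", "basic", "free"]

cleaned = impute_mean(order_value)
scaled = min_max_normalize(cleaned)
categories, encoded = one_hot(plan)

print(cleaned)
print(scaled)
print(categories, encoded)
```

Mean imputation is only one of several ways to handle empty cells (dropping rows or using a median are others), and the choice can change the resulting model, which is part of why these steps matter.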
I hear this a lot but I haven't seen too much of it in the real world. In my experience it's a competitive field to get into so even the junior people are pretty decent. They have to be.
I just want to say that I love this. I’m not your target audience, but I think the idea and execution are pretty damn neat and I hope you succeed.
I’d try to put some focus on helping users who aren’t ML savvy understand the different algorithms and hyperparameters; the choice can be confusing otherwise.
Hi there, thanks for the kind words and advice! Our next step is building an AutoML engine that indeed takes algorithm selection and hyperparameter choice out of the hands of inexperienced users.
As a warning, I've heard that the big spreadsheet makers considered this and eventually passed on the idea. I believe the logic is that spreadsheet users don't understand ML, and if you want to do ML, you probably know how to export a spreadsheet.
Hi Nalta! Thank you for your advice. We noticed that indeed most spreadsheet users do not do well with ML logic - as they are used to Excel functions, which have an input -> output structure (not validation cycles, hyperparam optimization etc.). We are building an AutoML engine, exactly for this reason - to indeed take everything "ML" away for spreadsheet users. Our ultimate goal is to turn ML literally into an Excel function. Super challenging problem, but really exciting at the same time :).
A significant amount of ML work is done in Excel with more on the way. For example, see the book "Data Smart: Using Data Science to Transform Information into Insight" by John W. Foreman.
There oughtta be a "law" about this; something like "Every software system eventually has an Excel implementation" or such.
What is your business model?