Hacker News
by padastra 1640 days ago
This is neat! Who's the target user?

For this to be usable to me (level of knowledge: I can do all of what's done on this page in Python / R, but don't have a PhD in stats or anything), I would need:

- Some sense of how training and validation is done

- Model weights

- Something that helps interpret fitting / overfitting

I'm not sure it's super useful for someone below my level of knowledge (or maybe at my level but just don't know Python?). It seems like a random marketing person, shopify person, etc. would need:

- Better understanding of when to use regression vs. classification

- Help interpreting whether the MAE / loss is good or bad

- Some automatic way to prevent overfitting

- Guidance on what constitutes good data and how to structure it for input

- Examples of how it might be applied to their use case

- Knowledge of how often models fail / how much they should be indexing on the model

2 comments

Thanks for the feedback!

1) The target user is: a freelance marketer, a small/medium enterprise without dedicated data scientists, technical people (e.g. engineers) from other fields, or anyone without extensive stats/ML knowledge.

2) Re: lack of hand-holding. This is likely the biggest challenge for this project: showing people how to think about ML, and how to use it to derive value for their business, without going through hours of training or lengthy tutorials.

> - Guidance on what constitutes good data and how to structure it for input

> - Examples of how it might be applied to their use case

Most users are stuck at this initial phase (preparing the data).

One thing I'm working on now is adding use-case focused guides: short articles explaining, for example, how a realtor would go about building a model to help them roughly value houses (including data collection). I hope this helps with these two points.

> Help interpreting whether the MAE / loss is good or bad

There are a couple of things I'm working on that might help: 1) Showing metric improvement relative to a baseline (e.g. the MAE of a model that always predicts the mean). 2) Showing both train and test curves. The current curve covers only test data.
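To make the baseline idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not how ML Console computes it: the function names (`mae`, `improvement_over_baseline`) and the toy numbers are invented for the example.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def improvement_over_baseline(y_train, y_test, y_pred):
    """Compare a model's test MAE with an 'always predict the training mean' baseline.

    Returns (baseline_mae, model_mae, relative_improvement), where a relative
    improvement of 0.3 means the model's MAE is 30% lower than the baseline's.
    """
    train_mean = sum(y_train) / len(y_train)
    baseline_mae = mae(y_test, [train_mean] * len(y_test))
    model_mae = mae(y_test, y_pred)
    if baseline_mae == 0:
        # The baseline is already perfect, so a relative improvement is undefined.
        return baseline_mae, model_mae, 0.0
    return baseline_mae, model_mae, 1 - model_mae / baseline_mae


# Toy data, invented for illustration only.
y_train = [1.0, 2.0, 3.0]
y_test = [1.0, 3.0]
y_pred = [1.5, 2.5]
baseline, model, gain = improvement_over_baseline(y_train, y_test, y_pred)
print(f"baseline MAE={baseline:.2f}, model MAE={model:.2f}, improvement={gain:.0%}")
```

A raw MAE is hard to judge on its own; seeing it next to the mean-predictor's MAE at least tells a non-expert whether the model learned anything beyond the average.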

> Better understanding of when to use regression vs. classification

> Some sense of how training and validation is done

I'm currently redesigning the UX around a step-by-step flow (for initial users at least), which should give a bit of room to explain things along the way (e.g. what classification/regression means for total beginners).

> Some automatic way to prevent overfitting

In the medium term there'll be a mode to continuously train models and tune hyperparameters, which should help avoid overfitting. Until then it's mostly handpicked parameters (including regularization), and having tested this on Kaggle challenges, it still sometimes beats my hand-written ML code :)
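For readers wondering what "tuning hyperparameters to avoid overfitting" can look like, here is a rough sketch: try several regularization strengths and keep the one with the lowest error on held-out validation data. This is an assumption about the general technique, not a description of what ML Console does or will do; the one-feature ridge model, the function names, and the toy data are all hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y ~ w * x (single feature, no intercept).

    Assumes sum(x^2) + lam is non-zero.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def pick_regularization(train, val, candidates):
    """Return (best_lambda, best_val_mae) over the candidate strengths."""
    (x_tr, y_tr), (x_va, y_va) = train, val
    best = None
    for lam in candidates:
        w = fit_ridge_1d(x_tr, y_tr, lam)
        val_mae = sum(abs(y - w * x) for x, y in zip(x_va, y_va)) / len(y_va)
        # Keep the strength with the lowest validation error seen so far.
        if best is None or val_mae < best[1]:
            best = (lam, val_mae)
    return best


# Toy data, invented for illustration: noisy training targets, clean validation.
train = ([1.0, 2.0], [3.0, 5.0])
val = ([4.0], [8.0])
best_lam, best_val_mae = pick_regularization(train, val, [0.0, 1.5, 5.0])
print(best_lam, best_val_mae)
```

The point is that the choice is driven by validation error rather than training error, which is what keeps the selected model from simply memorizing the training set.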

> Model weights

You can already download the model weights (download icon next to the model name); or do you mean feature importances? That's a planned feature, but it's not straightforward to implement in a generic way, so it might take a month or two before it's shipped.

Ah, for what it’s worth I looked for a while and didn’t see the download icon until you told me it was there, and I consider myself pretty good at picking up new user interfaces relative to your average user (e.g. became competent at Photoshop, Excel, etc. without handholding). It wasn’t intuitive that’s where the model weights would be stored so my brain never looked for a download icon.

You're right, it's way too subtle. I'll fix that!

A bigger issue: I realized the downloaded weights do not include the data preprocessing, so proper model export will unfortunately take more work before it's fully ready to use.
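To illustrate why weights alone are not enough, here is a small sketch of an export that bundles preprocessing parameters with the model. Everything in it is an assumption made for the example: the thread does not say which preprocessing or model format ML Console uses, so the standardization step, the linear model, the JSON layout, and the function names are all hypothetical.

```python
import json


def export_model(weights, bias, feature_means, feature_stds):
    """Serialize a linear model together with its standardization parameters."""
    return json.dumps({
        "preprocessing": {"means": feature_means, "stds": feature_stds},
        "model": {"weights": weights, "bias": bias},
    })


def predict(exported, raw_row):
    """Apply the stored preprocessing, then the stored linear model, to one raw row."""
    bundle = json.loads(exported)
    pre, model = bundle["preprocessing"], bundle["model"]
    # Standardize each raw feature with the statistics saved at training time.
    scaled = [(x - m) / s for x, m, s in zip(raw_row, pre["means"], pre["stds"])]
    return sum(w * x for w, x in zip(model["weights"], scaled)) + model["bias"]


# Toy values, invented for illustration only.
exported = export_model(
    weights=[2.0, -1.0], bias=0.5, feature_means=[10.0, 0.0], feature_stds=[2.0, 1.0]
)
prediction = predict(exported, [12.0, 3.0])
print(prediction)
```

Without the stored means and standard deviations, someone loading only the weights would feed unscaled inputs to the model and get meaningless predictions.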

As a random "marketing" person, I agree. I can blindly fumble my way to train a model, but a little more hand-holding would help me understand context and instill more confidence in the results I magically get by clicking things.

Agreed. Although the app is much easier to use than other ML apps I've tried, it's still a bit too confusing/"magical" for non-technical folks.

As mentioned in the parent comment, this is the #1 priority, and I hope a redesigned flow (e.g. step by step from loading the data, picking the target, then features, and explaining things at each step) will help here. I'll also have a page with concrete use cases, including marketing.

Please reach out (email on the website) if you have some ML use case you'd like to solve with ML Console, happy to help you prepare your data etc.

Since this is client-only, is there a GitHub repo for this? (It would also help with following discussions.)

I get the potential usage for this, and perhaps it would fit as an integration in other apps too.

The project is not open source for now. I might open source certain components in the future, but the priority is to finalize the app first.