Hacker News
by easygenes 408 days ago
This is very light and approachable, but it stops short of building the statistical intuition you want here. They fixate on the smoothness of squared errors without connecting that to the Gaussian noise model and establishing how that relates to predictive power on the kinds of data that occur naturally.
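For reference, the connection being pointed at can be sketched in a few lines. This assumes a linear model with i.i.d. Gaussian noise; the symbols (w, x_i, y_i, sigma) are illustrative and not taken from the article.

```latex
% Assumed model: linear signal plus i.i.d. Gaussian noise
y_i = w^\top x_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)

% Log-likelihood of n observations under that model
\log L(w) = -\frac{n}{2}\log(2\pi\sigma^2)
            - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2

% The first term does not depend on w, so maximizing the likelihood
% is the same as minimizing the sum of squared errors
\arg\max_w \log L(w) = \arg\min_w \sum_{i=1}^{n} \left(y_i - w^\top x_i\right)^2
```

So under this noise assumption, squared error is the maximum-likelihood objective and not only a convenient smooth function.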
3 comments

It isn't too hard to find resources on this for anyone genuinely looking to get a deeper understanding of a topic. I think a blog post (likely written for SEO purposes, which is in no way a knock against the content) is probably the wrong place for that kind of enlightenment, but I also think there are limits to the level of detail you can reasonably expect from a high-level blog post.

And for introductory content there's always the risk that if you provide too much information you overwhelm the reader and make them feel like maybe this is too hard for them.

Personally I find the process of building a model is a great way of learning all this.

I think a course is probably helpful, but the problem with things like DataCamp is that they are overly repetitive, and they don't do a great job of helping you look up earlier content unless you want to scroll through a bunch of videos where the formula is on screen for 5 seconds.

Would definitely just recommend getting a book for that stuff. I found "All of Statistics" good. I wouldn't recommend trying to read it from cover to cover, but it works well as a manual where I can look up the bits I need when I need them. Tho the book may be a bit intimidating if you're unfamiliar with integration and derivatives (as they often express the PDF/CDF of random variables in those terms).

>I think a blog post... is probably the wrong place for that kind of enlightenment

There's this site full of cool knowledgeable people called Hacker News which usually curates good articles with deep intuition about stuff like that. I haven't been there in years, though.

Yes, and it seems like it could’ve been written in part by an LLM. But the LLM could take your criticism, improve upon the original, and iterate that way until you feel that it has produced something close to an optimal textbook. The one thing missing is soul. I noticeably don’t feel like there was anyone behind this writing.
Ah, we’re resorting to ad machinum today. :)
Any resource/link you know of that further develops your point?
CMU lecture notes [0] I think approach it in an intuitive way, starting from the Gaussian noise linear model, deriving the log-likelihood, and presenting the analytic approach. They miss the bridge to gradient methods though.

For gradients, Stanford CS229 [1] jumps right into it.

[0] https://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/06/lectu...

[1] https://cs229.stanford.edu/lectures-spring2022/main_notes.pd...
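To make the two routes concrete, here is a minimal sketch (not taken from either set of notes) that fits the same Gaussian-noise linear model analytically and by gradient descent on the mean squared error. The data, the parameter values, and the function names `fit_closed_form` and `fit_gradient_descent` are all made up for illustration.

```python
import numpy as np

def fit_closed_form(X, y):
    # Least-squares solution; under i.i.d. Gaussian noise this is also
    # the maximum-likelihood estimate of the weights.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fit_gradient_descent(X, y, lr=0.1, steps=2000):
    # Plain batch gradient descent on the mean squared error.
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad = (2.0 / n) * X.T @ (X @ w - y)  # gradient of mean((Xw - y)^2)
        w -= lr * grad
    return w

# Synthetic data (assumption): one feature plus an intercept, Gaussian noise.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
true_w = np.array([0.5, 2.0])  # hypothetical "true" parameters
y = X @ true_w + rng.normal(0.0, 0.3, size=n)

w_cf = fit_closed_form(X, y)
w_gd = fit_gradient_descent(X, y)
print("closed form:     ", w_cf)
print("gradient descent:", w_gd)
```

Both should land on essentially the same weights, since they minimize the same convex objective; the gradient version just gets there iteratively.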

Thanks! Will have a look.