by osanseviero 891 days ago
Hey @godelski! Author of the blog post here.

I really appreciate you taking the time to provide all this feedback. This feedback + additional resources are extremely useful.

I agree that the subtitle is not as accurate as it could be. I'll revisit it! As for content updates, I've been making some additional updates over the last few days based on feedback (e.g. more info about tokenization and the token embeddings). Although diving into some of your suggestions is likely out of scope for this article, I agree in particular that expanding the attention mechanism content (e.g. the analogy with databases or explaining what a dot product is) would increase the quality of the article. I will look into expanding this!

I also think a more rigorous, separate mathematical exploration into attention mechanisms and recent advancements would be a great tool for the ecosystem.

Once again, thank you for all the amazing feedback!


Hey, I'm glad you found it useful. I know it is hard to take critique, but I did enjoy the post. I truly do mean that the critique comes from a place of love. And I hope the comment helps others find more (I guess I'm writing a blog post now). I do feel there is often a gap between nearly no math and way too much math, which causes a lot of people to come away with "you don't need math for ML", which is... idk... partially correct but not? haha. I'm a bit of a mathy person, so you just caught a pet peeve of mine. I definitely agree that what I said is out of scope for the way you wrote the article, but I will stand by my subtitle critique ;) I still do like the article though.

And I just realized we're in a Slack channel together haha (I don't think we've ever talked though). I poked around your website and saw you're at HF. Love you guys to death. You all also have tons of awesome blog posts and you're one of the most useful forces in ML. So I really do appreciate all the work.