Ask HN: How to Do ML Research?
4 points by giveexamples 1120 days ago
For the researchers out there, how do you do research?

Background:

I've been looking at how to create a recurrent seq-to-seq model that isn't a transformer. The ideas I implement do not work. It seems like off the well-trodden path there are traps everywhere: how to tune parameters, whether to add biases, how to normalize, whether the dataset is impossible, exploding and vanishing gradients, etc.

From a "research = gradient descent" point of view, I'm stuck at a point with no gradient: I have no idea what I'm doing wrong or what will get a better result. Am I missing a workflow, intuition, tools, or something else?

1 comment

Here are some AI research links I've collected over the past while. I put them in a public gist: https://gist.github.com/TikkunCreation/5de1df7b24800cc05b482...

Karpathy's post about the research process in particular may be helpful for you.

Karpathy's post is really great, thanks!