Hacker News
by thecolorgreen 845 days ago
Why doesn't Equation 1b use the h' defined in Equation 1a?
2 comments

Hey! OP here. Great question - h' in Equation 1a is the derivative of h with respect to time (t). Equation 1a is a differential equation: once we have the input x, we can solve it mathematically to get a closed-form solution for h. We would then plug that h (the hidden state) into Equation 1b.

In our case, we don't actually solve for a closed form; instead we compute the discrete representation (Equation 2).
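
To make that concrete, here is a rough sketch in Python. It assumes Equation 1a has the usual linear form h'(t) = A h(t) + B x(t), Equation 1b is y(t) = C h(t), and Equation 2 is the recurrence h_t = Ā h_{t-1} + B̄ x_t with y_t = C h_t. It uses a scalar state and a zero-order-hold rule for Ā and B̄, which is one common choice and an assumption here. The function names and numbers are made up for illustration.

```python
import math


def discretize_zoh(a, b, delta):
    """Zero-order-hold discretization of the scalar system h' = a*h + b*x.

    Returns (a_bar, b_bar) for the recurrence h_t = a_bar*h_{t-1} + b_bar*x_t.
    Assumes a != 0.
    """
    a_bar = math.exp(delta * a)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar


def run_discrete(a, b, c, delta, xs):
    """Step the discrete recurrence over the inputs xs, starting from h = 0."""
    a_bar, b_bar = discretize_zoh(a, b, delta)
    h = 0.0
    ys = []
    for x in xs:
        h = a_bar * h + b_bar * x  # state update (the discrete version of 1a)
        ys.append(c * h)           # output uses h itself, never h'
    return ys


def closed_form_constant_input(a, b, c, t, x):
    """Closed-form y(t) for constant input x and h(0) = 0.

    Solving h' = a*h + b*x gives h(t) = (b*x/a) * (exp(a*t) - 1).
    """
    return c * (b * x / a) * (math.exp(a * t) - 1.0)


if __name__ == "__main__":
    a, b, c, delta = -0.5, 1.0, 2.0, 0.1
    ys = run_discrete(a, b, c, delta, [1.0] * 5)
    for k, y in enumerate(ys, start=1):
        print(k, y, closed_form_constant_input(a, b, c, k * delta, 1.0))
```

With a constant input the two columns agree at every step, since zero-order hold is exact when x is held fixed over each interval. In both versions h' only appears while solving for h; the output line only ever reads h.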

Hope that helps!

I believe h' is for the next state. y(t) is there to predict the next word, so it uses the current hidden state h(t).