by yolorn123 3141 days ago
The reason is that a GRU has only two gates (update and reset) and no separate internal cell state like the LSTM's. Because it has fewer parameters than an LSTM of the same size, it takes less time to train.
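
To put a rough number on the parameter difference, here is a minimal sketch that counts weights for one layer of each. It assumes the textbook formulations with one bias vector per block (some libraries keep two, which changes the totals but not the ratio). The function names are made up for illustration and do not come from any library.

```python
def block_params(input_size: int, hidden_size: int) -> int:
    """Parameters in one gate or candidate block:
    input weights + recurrent weights + one bias vector."""
    return hidden_size * (input_size + hidden_size) + hidden_size


def gru_params(input_size: int, hidden_size: int) -> int:
    # GRU: update gate, reset gate, candidate hidden state -> 3 blocks
    return 3 * block_params(input_size, hidden_size)


def lstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input gate, forget gate, output gate, candidate cell state -> 4 blocks
    return 4 * block_params(input_size, hidden_size)


if __name__ == "__main__":
    d, h = 128, 256  # arbitrary example sizes
    g, l = gru_params(d, h), lstm_params(d, h)
    print(f"GRU:  {g} parameters")
    print(f"LSTM: {l} parameters")
    print(f"GRU / LSTM = {g / l:.2f}")
```

Under these assumptions the GRU always has 3/4 of the LSTM's parameters for the same input and hidden sizes. How much that shortens training in practice depends on the framework and hardware.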