| HN Mirror

	by yolorn123 3141 days ago
	The reason is that a GRU has only two gates (update and reset) and, unlike an LSTM, no separate internal cell state. Since it has fewer parameters than an LSTM of the same size, it takes less time to train.
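
	A rough way to see the parameter gap is to count weights, as in the sketch below. It assumes the textbook formulation with one bias vector per block (some frameworks keep two, so their totals differ slightly), and the function names and layer sizes are made up for illustration.

```python
def gate_block_params(input_size: int, hidden_size: int) -> int:
    """Weights for one gate-like block: input weights, recurrent weights, one bias."""
    return hidden_size * (input_size + hidden_size) + hidden_size


def lstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input, forget and output gates plus the candidate cell update (4 blocks).
    return 4 * gate_block_params(input_size, hidden_size)


def gru_params(input_size: int, hidden_size: int) -> int:
    # GRU: update and reset gates plus the candidate hidden state (3 blocks).
    return 3 * gate_block_params(input_size, hidden_size)


if __name__ == "__main__":
    n_x, n_h = 128, 256  # illustrative sizes, not taken from any real model
    lstm, gru = lstm_params(n_x, n_h), gru_params(n_x, n_h)
    print(f"LSTM: {lstm}  GRU: {gru}  ratio: {gru / lstm:.2f}")
```

	Under these assumptions the GRU ends up with three quarters of the LSTM's parameters for any input and hidden size, which is where most of the training-time saving would come from.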