	by albertzeyer 1046 days ago
	Whenever a working implementation of a model already exists (and maybe even a checkpoint), the most effective way to be sure your own implementation is correct is to import that checkpoint and compare the model outputs. If they do not match (which is almost always the case at first, as you likely got some details wrong), you can systematically go through the layers one by one. You will find the real differences and learn from them. Maybe you will even find some oddities in the existing implementation. This is about the model itself. Training is another aspect, but once the hyperparameters are more or less the same, it should usually be fine if the model is correct.
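	To make the layer-by-layer comparison concrete, here is a minimal sketch in plain Python. Everything in it is made up for illustration: the tiny three-layer model, the checkpoint values, the tolerance, and the helper names (`build_layers`, `first_mismatch`). The "bug" is a deliberately chosen example, a tanh-approximated GELU where the reference uses the exact erf form. With a real framework you would capture intermediate activations instead, but the idea is the same.

```python
import math

def linear(x, weight, bias):
    # weight holds one row per output unit.
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weight, bias)]

def layer_norm(x, eps=1e-5):
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def gelu_erf(x):
    # Exact GELU, used by the (hypothetical) reference implementation.
    return [0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0))) for v in x]

def gelu_tanh(x):
    # Tanh approximation, standing in for a detail the port got wrong.
    c = math.sqrt(2.0 / math.pi)
    return [0.5 * v * (1.0 + math.tanh(c * (v + 0.044715 * v ** 3))) for v in x]

def build_layers(ckpt, gelu):
    # Both models load the same checkpoint; only the code differs.
    return [
        ("linear", lambda x: linear(x, ckpt["weight"], ckpt["bias"])),
        ("layer_norm", layer_norm),
        ("gelu", gelu),
    ]

def first_mismatch(ref_layers, new_layers, x, tol=1e-6):
    """Return (layer name, max abs diff) for the first diverging layer."""
    for (name, ref_fn), (_, new_fn) in zip(ref_layers, new_layers):
        ref_out = ref_fn(x)
        # Feed the new layer the reference input, so an early bug
        # cannot mask or amplify a later one.
        new_out = new_fn(x)
        diff = max(abs(a - b) for a, b in zip(ref_out, new_out))
        if diff > tol:
            return name, diff
        x = ref_out
    return None, 0.0

# Made-up checkpoint and input, for illustration only.
ckpt = {"weight": [[0.5, -1.0], [1.5, 2.0], [-0.3, 0.8]],
        "bias": [0.1, -0.2, 0.3]}
x = [1.0, 2.0]

reference = build_layers(ckpt, gelu_erf)
candidate = build_layers(ckpt, gelu_tanh)

name, diff = first_mismatch(reference, candidate, x)
print("first mismatching layer:", name, "max abs diff:", diff)
```

	Feeding each candidate layer the reference activations, rather than its own previous output, is a design choice: it isolates every layer, so you can fix one discrepancy and rerun to find the next.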