| In the tweet, Jeff Dean says that Cheng et al. failed to follow the steps required to replicate the work of the Google researchers. Specifically: > In particular the authors did no pre-training (despite pre-training being mentioned 37 times in our Nature article), robbing our learning-based method of its ability to learn from other chip designs But in Google's Circuit Training repo[1] they specifically say: > Our results training from scratch are comparable or better than the reported results in the paper (on page 22) which used fine-tuning from a pre-trained model. I may be misunderstanding something here, but which one is it? Did they mess up by not pre-training, or did they follow the "steps" described in the original repo in an attempt at a fair reproduction? Also, the UCSD group had to reverse-engineer several steps to reproduce the results, so it seems the paper's results weren't reproducible on their own. [1]: https://github.com/google-research/circuit_training/blob/mai... |