by tastroder
2603 days ago
Glancing through the paper, it seems they use the recent Transformer model. Does the underlying stack they use expose a way to share both the RNG seeds and the exact hardware optimizations your environment applies during training? If not, "publishing the seed" sounds nice but might not be as trivial as the phrase suggests.
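To make the seed point concrete, here is a minimal sketch using only Python's standard library. It is not the paper's stack, and `seeded_values` and `ordered_sum` are hypothetical helper names invented for illustration. It shows that a shared seed reproduces the random inputs exactly, while floating-point addition still gives different answers when the order of operations changes, which is the kind of thing a hardware or compiler optimization may do during a parallel reduction.

```python
import random


def seeded_values(seed, n=5):
    """Return n pseudo-random floats from a generator with a fixed seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


def ordered_sum(values):
    """Add floats strictly left to right, with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same seed, same random stream: this part is easy to publish.
same_inputs = seeded_values(42) == seeded_values(42)

# Same numbers, different accumulation order: the results diverge,
# because floating-point addition is not associative.
left_to_right = ordered_sum([1e16, 1.0, -1e16])  # the 1.0 is absorbed by 1e16
reordered = ordered_sum([1e16, -1e16, 1.0])      # the 1.0 survives

print(same_inputs, left_to_right, reordered)
```

In a real training run the reordering would come from the framework and hardware rather than from a hand-written loop, so the seed alone may not pin down the result.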
So, if their experiment was designed in a way that makes reproduction inherently difficult, they should have designed it better and used a toolset that doesn't run into that problem.
A non-reproducible experiment isn't necessarily without value, but everyone should look askance at it until it proves its worth.
(Apologies if my comments don't apply to this experiment and it is in fact reproducible. I didn't have time to read through the OP, but I thought this was still a worthwhile response to its specific parent comment.)