by reignbeaux
938 days ago
Unfortunately this is only part of the problem. Even ML studies that use public datasets, which should be very easy to reproduce once the code is shared, are often surprisingly hard to repeat. Sometimes only parts of the code are published, the code has a lot of bugs (who knows why? Added intentionally?), the code is very badly documented, or the exact libraries are not properly specified. And this is in a field where everything is based on code, where in principle reproducibility is easy. Go into materials science or chemistry, try to synthesize something following a published paper, and you run into all sorts of problems: different equipment, different temperatures, not all steps documented, ... Reproducing experimental findings can take you months.