by reedciccio
380 days ago
Search more: there is a lot of literature discussing how hard the problem of reproducibility of GenAI/LLMs/Deep Learning is, how far we are from solving it even for trivial/small models (let alone for beasts the size of the most powerful ones), and even how pointless the whole exercise is.
There simply aren't that many sources of non-determinism in a modern computer.
Though I'll grant that if you've engineered your codebase for speed and not for determinism, discrepancies can creep in via floating-point rounding, sloppy ordering of operations, etc. These are implementation details, though, and not unavoidable ones. CAD kernels and other scientific software deal with them every day.
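To make the ordering point concrete, here is a minimal sketch in plain Python (not taken from any training framework; the helper name `ordered_sum` and the sample values are made up for illustration). It shows that floating-point addition is not associative, so changing the summation order can change the result, while a fixed order, or an order-independent routine such as `math.fsum`, gives the same answer every time:

```python
import math

# Floating-point addition is not associative: grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

def ordered_sum(xs):
    """Sum strictly left to right (hypothetical helper, for illustration)."""
    total = 0.0
    for x in xs:
        total += x
    return total

# The same four numbers in two different orders.
order_a = [1e16, 1.0, 1.0, -1e16]   # the small terms are absorbed by 1e16
order_b = [1.0, 1.0, 1e16, -1e16]   # the small terms are combined first

print(ordered_sum(order_a))  # 0.0
print(ordered_sum(order_b))  # 2.0

# A fixed order is repeatable: the same input order gives the same bits.
print(ordered_sum(order_a) == ordered_sum(order_a))  # True

# math.fsum tracks the lost low-order bits, so its result does not
# depend on the order of the inputs.
print(math.fsum(order_a), math.fsum(order_b))  # 2.0 2.0
```

A parallel reduction that lets the scheduler choose the order is effectively picking between `order_a` and `order_b` at run time. Pinning the order, or using a compensated sum, removes that choice at some cost in speed.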
When you boil down what's actually happening during training, it's just a bunch of matrix math. And math is highly repeatable. The size of the matrix has nothing to do with it.
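As a toy illustration of that repeatability, here is a sketch of a tiny single-threaded "training" loop in plain Python. Everything in it is an assumption for the example: the `train` function, the target `y = 3x + 0.5`, the learning rate, and the seed. It says nothing about what GPU kernels or real frameworks do. All randomness comes from one seeded generator and the arithmetic runs in a fixed order, so two runs agree bit for bit:

```python
import random

def train(seed, steps=200, lr=0.05):
    """Fit y = w*x + b by SGD; a hypothetical toy, not a real training setup."""
    rng = random.Random(seed)      # the only source of randomness, seeded
    w, b = rng.uniform(-1, 1), 0.0
    for _ in range(steps):
        x = rng.uniform(-1, 1)
        y = 3.0 * x + 0.5          # made-up target function
        err = (w * x + b) - y
        w -= lr * err * x          # gradient step on w
        b -= lr * err              # gradient step on b
    return w, b

run1 = train(seed=42)
run2 = train(seed=42)

# Compare exact bit patterns, not approximate values.
print([v.hex() for v in run1] == [v.hex() for v in run2])  # True
```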
I have little doubt that some implementations aren't deterministic, due to software engineering choices as discussed above. But the algorithms absolutely are. Claiming otherwise seems equivalent to claiming that 2 + 2 can sometimes equal 5.