by gkbrk 753 days ago
To compare, the PyTorch repo has ~400k lines of C, ~850k lines of C++ and more than 1.5 million lines of Python code.

PyTorch does more than tinygrad, but does it really do 343x more things?
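For a rough sense of scale, here is a minimal sketch of the arithmetic behind that "343x", using only the approximate counts quoted above. The post doesn't say how the ratio was computed, so this assumes it is total PyTorch lines divided by tinygrad lines; the Python count is "more than 1.5 million", so the result is a lower bound at best.

```python
# Approximate line counts quoted in the post (not independently measured).
pytorch_loc = {"C": 400_000, "C++": 850_000, "Python": 1_500_000}

# The "343x" figure from the post; assumed here to mean
# (total PyTorch lines) / (tinygrad lines).
ratio = 343

total_loc = sum(pytorch_loc.values())
implied_tinygrad_loc = total_loc / ratio

print(f"PyTorch total: ~{total_loc:,} lines")
print(f"Implied tinygrad size: ~{implied_tinygrad_loc:,.0f} lines")
```

Under that assumption, the quoted numbers imply a tinygrad codebase of roughly eight thousand lines.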

6 comments

If PyTorch does the one or two things you need and tinygrad doesn't, then what are you going to use?

The Python source distribution has long maintained the philosophy of “batteries included” – having a rich and versatile standard library which is immediately available, without making the user download separate packages.

https://peps.python.org/pep-0206/

OTOH:

  Simple is better than complex.
  Complex is better than complicated.
https://peps.python.org/pep-0020/
PyTorch, of course. Or alternatively a lib or custom code on top of tinygrad. Is that a problem?
geohot explained on one of his streams, and per my terrible memory: “tiny” is a way of expressing the architecture constraint that the system should not attempt to target [(many hardware architectures and their optimizations) * (many model, training, etc. variants)] the way PyTorch does, which requires maintaining a shit ton of code and needs the staff/community that Meta has behind it. Instead, tinygrad should provide core abstractions that can be composed to accomplish a similar set of targets, but for only one hardware architecture (for now, I guess). He is releasing a companion hardware item which would fund the development, I believe.
I think you massively underestimate the complexity of PyTorch. Even if we exclude all GPUs except AMD, and exclude clang (required for the AOT engine), PyTorch depends on almost every ROCm library. Internally it depends on the original Triton library, on a forked Triton, and on aotriton, which depends on a forked MLIR (because AMD doesn't contribute these MLIR changes upstream), which in turn depends on yet another forked LLVM/Clang (because the LLVM API is not stable enough for them, I guess). And then there are MIOpen/rocBLAS/hipBLASLt/hipSOLVER/rocFFT/etc., libraries with gigabytes (!) of autogenerated code. Additionally, there are dozens of smaller linked libraries like oneDNN, LIBXSMM, MAGMA, NumPy, and OpenBLAS, all needed for running "things". So even without the autogenerated code, consider multiplying 1.5 million LOC by 100.
Probably.
Easily
uh, ya? lol