by megadragon9 38 days ago
I'm continuing to expand my own deep learning library [1] (a PyTorch clone built with Python and NumPy) to support LLM post-training techniques like supervised fine-tuning (SFT) [2] and reinforcement learning with GRPO [3]. It's a good learning experience to work without the high-level abstractions: first "build a wheel", then "use that wheel to build a car". Post-training results are still cooking, since training on my MacBook Pro is quite slow with "unoptimized PyTorch" :)
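For anyone unfamiliar with GRPO, here is a rough NumPy-only sketch of its core idea, the group-relative advantage: sample several completions per prompt, score them, and normalize each reward against the mean and standard deviation of its own group instead of using a learned value function. This is not taken from the linked repo; the function name `group_relative_advantages`, the `eps` parameter, and the toy rewards are all made up for illustration.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within each group of sampled completions.

    rewards: array-like of shape (num_prompts, group_size), one row per prompt.
    Hypothetical helper for illustration, not the linked library's API.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    mean = rewards.mean(axis=1, keepdims=True)   # per-prompt baseline
    std = rewards.std(axis=1, keepdims=True)     # per-prompt scale
    return (rewards - mean) / (std + eps)        # eps guards against zero std

# Toy rewards: 2 prompts, 4 sampled completions each (made-up numbers).
rewards = [[1.0, 0.0, 0.0, 1.0],
           [0.5, 0.5, 0.5, 0.5]]
adv = group_relative_advantages(rewards)
```

Note that when every completion in a group gets the same reward (second row), all advantages are zero, so that prompt contributes no gradient signal.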

1. https://github.com/workofart/ml-by-hand

2. https://github.com/workofart/ml-by-hand/blob/main/examples/s...

3. https://github.com/workofart/ml-by-hand/blob/main/examples/g...