I've spent the last few months building a deep learning engine completely from scratch in Python, using only the standard library's math and random modules. What started as a basic linear algebra calculator grew into a symbolic tensor system with autodiff, custom matrix ops, attention mechanisms, LayerNorm, GELU, and even a text generation demo trained on the Brown corpus.

I'm still an undergrad, so my main goal is to deeply understand how deep learning actually works under the hood (gradients, attention, backpropagation, optimizers) by building it step by step from first principles, with full visibility into everything and without relying on big frameworks or libraries. It's not fast or production-ready, but that's not the point. For now it's aimed at exploration and understanding.

It's still a work in progress, with lots to learn and improve in terms of structure, docs, and performance, but I figured it was worth sharing. I'd love any feedback, questions, or ideas, or even just thoughts on what you'd add, change, or do differently.
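For anyone wondering what "autodiff using only math" can look like in practice, here is a minimal sketch of scalar reverse-mode differentiation. It is an illustration only, not the engine's actual code: the `Value` class, its method names, and the choice of the tanh approximation for GELU are all assumptions made for this example.

```python
import math


class Value:
    """A scalar that records how it was computed (hypothetical illustration class)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def gelu(self):
        # tanh approximation of GELU (an assumption; an exact erf form also exists)
        x = self.data
        c = math.sqrt(2.0 / math.pi)
        t = math.tanh(c * (x + 0.044715 * x ** 3))
        out = 0.5 * x * (1.0 + t)
        # Derivative via the product and chain rules
        du = c * (1.0 + 3.0 * 0.044715 * x ** 2)
        d = 0.5 * (1.0 + t) + 0.5 * x * (1.0 - t * t) * du
        return Value(out, (self,), (d,))

    def backward(self):
        # Build a topological order so each node is processed after all its users
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for p, g in zip(v._parents, v._local_grads):
                p.grad += g * v.grad  # chain rule, accumulated over all paths


# Usage: z = x*w + x, so dz/dx = w + 1 and dz/dw = x
x, w = Value(2.0), Value(3.0)
z = x * w + x
z.backward()
```

After `z.backward()`, `x.grad` holds `w + 1` and `w.grad` holds `x`. The recursive graph walk is fine for small expressions like this one; a deep graph would need an iterative traversal.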
Thanks for reading!