Hacker News
by 10GBps 18 days ago
I learned TCP/IP by watching and reading raw packets over packet radio at 1200 baud.

I've noticed the same thing is possible if you watch the output of a slow LLM. Eventually you start to see the machinery. Input tokens = output tokens; it's math. I can't exactly predict the tokens generated, but I can see how they are formed. It's a lot like chess: you can't see every possible move, but the mechanism is understandable.

4 comments

Comment <-> username synergy.
https://distill.pub/2019/activation-atlas/

I can only imagine what sort of visualizations are going on today inside of the AI labs.

It's basically possible to build an LLM using just routers+packets, and then hook them up to Wireshark to see it compute!
How would I set this up?
I'd also recommend watching Karpathy's videos, focusing on the early parts where he deals with tokenization and embedding generation (which gets really overlooked). He covers this in most of his videos.
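
If it helps to see that step concretely, here is a toy sketch of tokenization followed by an embedding lookup. It is not taken from Karpathy's videos. It assumes a whitespace word tokenizer and a randomly initialized table, whereas real models typically use subword tokenizers (such as BPE) and learned embeddings. All function names here are made up for illustration.

```python
import random

def build_vocab(text):
    """Assign each distinct whitespace-separated word an integer id (toy tokenizer)."""
    vocab = {}
    for word in text.split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Turn text into a list of token ids using the vocab."""
    return [vocab[word] for word in text.split()]

def make_embeddings(vocab_size, dim, seed=0):
    """Build a vocab_size x dim table of random floats (a real model learns these)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]

def embed(token_ids, table):
    """Embedding lookup: each token id selects one row of the table."""
    return [table[t] for t in token_ids]

text = "the cat sat on the mat"
vocab = build_vocab(text)
ids = tokenize(text, vocab)
table = make_embeddings(len(vocab), dim=4)
vectors = embed(ids, table)

print(ids)                       # [0, 1, 2, 3, 0, 4]
print(vectors[0] == vectors[4])  # True: both positions hold the token "the"
```

The point to notice is that the model never sees text, only ids, and the same id always maps to the same row of the table. Everything downstream operates on those vectors.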