Here's also a nice tour of the building blocks, which could double as transformers/tensorflow API reference documentation: https://www.youtube.com/watch?v=eMXuk97NeSI&t=207s
The #1 visualization of architecture and size progression: https://bbycroft.net/llm