by dfgtyu65r 835 days ago
Normally, an LLM is composed of multiple transformer blocks, where each block consists of a (multi-head) attention component and a fully-connected feedforward component. These blocks are stacked on top of each other, with each block's output feeding into the next, to give the final output of the network.
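
To make the stacking concrete, here is a rough pure-Python sketch of that structure. It is only an illustration under some assumptions of mine: the helper names (`init_block`, `run_stack`, etc.), the sizes, and the random weights are all made up, I have added the usual residual connections around each component, and I have left out layer normalization, biases, causal masking, embeddings and the output head that a real model would have.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(u * v for u, v in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rng, rows, cols):
    """Hypothetical weight init: small Gaussian values."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(x, p, n_heads):
    """Self-attention where each head works on its own slice of the features."""
    d_head = len(x[0]) // n_heads
    q, k, v = matmul(x, p["wq"]), matmul(x, p["wk"]), matmul(x, p["wv"])
    heads = []
    for h in range(n_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [r[lo:hi] for r in q]
        kh = [r[lo:hi] for r in k]
        vh = [r[lo:hi] for r in v]
        scores = matmul(qh, [list(c) for c in zip(*kh)])  # qh @ kh^T
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        heads.append(matmul(weights, vh))
    # Concatenate the heads along the feature axis, then mix them with wo.
    concat = [sum((head[i] for head in heads), []) for i in range(len(x))]
    return matmul(concat, p["wo"])

def feed_forward(x, p):
    """Position-wise two-layer network with a ReLU in between."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, p["w1"])]
    return matmul(hidden, p["w2"])

def transformer_block(x, p, n_heads):
    """One block: attention, then feedforward, each with a residual connection (my assumption)."""
    x = add(x, multi_head_attention(x, p, n_heads))
    return add(x, feed_forward(x, p))

def init_block(rng, d_model, d_ff):
    """Random parameters for one block (illustrative only)."""
    p = {name: rand_matrix(rng, d_model, d_model) for name in ("wq", "wk", "wv", "wo")}
    p["w1"] = rand_matrix(rng, d_model, d_ff)
    p["w2"] = rand_matrix(rng, d_ff, d_model)
    return p

def run_stack(x, blocks, n_heads):
    """Stack the blocks: the output of one block is the input of the next."""
    for p in blocks:
        x = transformer_block(x, p, n_heads)
    return x

rng = random.Random(0)
seq_len, d_model, d_ff, n_heads, n_blocks = 4, 8, 16, 2, 3
blocks = [init_block(rng, d_model, d_ff) for _ in range(n_blocks)]
x = rand_matrix(rng, seq_len, d_model)   # stand-in for token embeddings
out = run_stack(x, blocks, n_heads)
print(len(out), len(out[0]))             # same shape as the input: 4 8
```

The point to notice is that every block maps a (sequence length x model width) matrix to another matrix of the same shape, which is what lets you stack as many blocks as you like.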