by blackeyeblitzar
791 days ago
> It looks like a mid-level implementation of training and inference

I'm not familiar with how any of this works, but what does state-of-the-art training look like? Almost no models release their training source code, datasets, preprocessing, or evaluation code. So is it even known what the high-level implementation is?

This is probably a good baseline to start thinking about LLM training at scale.