| HN Mirror



	by turnsout 1196 days ago
	Amazing work so far! Do you have any sense about how difficult it would be to enable M1/M2 or CoreML support?

2 comments

gkucsko 1196 days ago

thanks, the model itself is a pretty vanilla GPT model based heavily on Karpathy's nanoGPT, so it should not need too many bells and whistles to get it running on specific architectures. that said, i have very little experience with platform-specific development, so would looove some help from the community :)


brucethemoose2 1196 days ago

Could you stick a torch.compile in the inference and training code, maybe gated behind a flag? This should help AMD/Nvidia performance (and probably other vendors soon) significantly.

PyTorch themselves used nanoGPT training as a demo for this: https://pytorch.org/blog/accelerating-large-language-models/
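
For anyone wondering what "gated behind a flag" could look like, here is a rough sketch. It is not taken from the project's code: the `--compile` flag and the `maybe_compile` and `build_parser` helpers are made-up names for illustration. The only real API assumed is `torch.compile`, which exists from PyTorch 2.0 onward; on older installs, or without PyTorch, the helper just hands the model back unchanged.

```python
import argparse


def build_parser():
    """Build a CLI parser with a hypothetical opt-in --compile flag."""
    parser = argparse.ArgumentParser(description="inference/training entry point (sketch)")
    parser.add_argument(
        "--compile",
        action="store_true",
        help="wrap the model with torch.compile when it is available",
    )
    return parser


def maybe_compile(model, enabled):
    """Return torch.compile(model) if requested and available, else the model as-is."""
    if not enabled:
        return model
    try:
        import torch  # imported lazily so the flag stays optional
    except ImportError:
        return model
    compile_fn = getattr(torch, "compile", None)  # absent before PyTorch 2.0
    if compile_fn is None:
        return model
    return compile_fn(model)
```

Usage would be something like `args = build_parser().parse_args()` followed by `model = maybe_compile(model, args.compile)` right after the model is built. Keeping it off by default means nothing changes for people on platforms where compilation is slow or unsupported.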


ttul 1196 days ago

A serious nod to Karpathy here. They could have chosen any other Transformer architecture, but chose perhaps the most reachable one - in the literal sense.


tmzt 1196 days ago

Would the same apply to a GGML port, or are the architectures too different?
