by rcarmo 500 days ago
Yeah, well, it's not _just_ PTX. Think about what you would do if you had to work in a resource-constrained system (that's a mindset I closely relate to since I still do C++ for MCUs, and it makes you dig _under_ the libraries to save resources).
1 comment

Totally, they did great work under their constraints. Training in FP8, the MLA (multi-head latent attention) thing they introduced in DeepSeek-V2, etc. I just take particular issue with the attention the PTX thing is getting because (a) it's not like other labs don't do stuff like that and (b) it doesn't contribute nearly as much to their outcome as the other algorithmic and operational improvements they've made.