by janalsncm
805 days ago
In their defense, it’s because the article is (understandably) sparse on details about what makes the requirements of their ranking models different from image classification or LLMs. Unless you work in industry, it’s unlikely you will have heard of DeepFM or ESMM or whatever Meta is using. And building out specialized hardware does lock you in to a certain extent. Want to use more than 128GB of memory? Too bad, your $10B chip doesn’t support that.

Which is probably why Meta is also buying the biggest Nvidia datacenter cards by the shipload. There is no need to run inference for a small model - say for a text-ad recommendation system - on an H100 with the attendant electricity and cooling costs.