Hacker News
by joshhart 933 days ago
Trainium and Inferentia sadly aren't going this way; they have their own approach (https://github.com/aws-neuron/transformers-neuronx). I think the best-case scenario is that some middleware like Triton (https://openai.com/research/triton) becomes the standard and we can build adapters to all of these backends. It is disappointing, though.