Hacker News
by kadushka 461 days ago
There’s not enough improvement over regular LLMs to motivate optimization effort. Recall that the original transformer was well received because it was faster and more scalable than RNNs.