by choilive 802 days ago
This is the state of LLMs today. It is likely that future models will support some form of "online" training, or that new training methods will be far less compute intensive. Many people are working on these scaling issues right now. We already have attention variants that work around the quadratic time and space cost of standard attention, which grows with the square of the input prompt length.
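To put rough numbers on the quadratic part, here is a minimal sketch that counts attention score entries per head. The function names and the sliding-window scheme are my own illustration of one kind of workaround, not a claim about which variant any particular model uses.

```python
def full_attention_scores(n: int) -> int:
    """Entries in the n x n score matrix for one head of standard attention."""
    return n * n


def windowed_attention_scores(n: int, window: int) -> int:
    """Entries when each token attends to itself and at most window - 1 earlier tokens.

    Hypothetical sliding-window scheme, used here only to show linear growth.
    """
    return sum(min(i + 1, window) for i in range(n))


if __name__ == "__main__":
    # Doubling n quadruples the full count but only about doubles the windowed one.
    for n in (1_000, 2_000, 4_000):
        print(n, full_attention_scores(n), windowed_attention_scores(n, window=256))
```

The window size of 256 is arbitrary. The point is only that the full count scales with n squared while the windowed count scales with n times the window.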