by macilacilove 900 days ago
Most importantly, because the context length is not long enough. If it were long enough, they could possibly do it with clever training. I have only trained much smaller language models, though.