by 2099miles 808 days ago
The LLM itself should realize it’s too big and only put the important parts on the GPU. If you’re asking questions about literature, there’s no need to have all the params on the GPU; just tell it to load only the ones for literature.
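Roughly what I have in mind, as a toy sketch. It assumes the weights are already split into groups tagged by topic, which as far as I know is not how real dense models are laid out, so `ParamGroup`, the `topic` labels, the sizes, and `plan_placement` are all made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    name: str
    topic: str       # hypothetical label; real LLM weights carry no topic tag
    size_gb: float


def plan_placement(groups, topic, gpu_budget_gb):
    """Assign on-topic groups to the GPU while they fit; the rest stay on the CPU."""
    placement = {}
    used = 0.0
    # Visit on-topic groups first, smallest first, so more of them fit in the budget.
    for g in sorted(groups, key=lambda g: (g.topic != topic, g.size_gb)):
        if g.topic == topic and used + g.size_gb <= gpu_budget_gb:
            placement[g.name] = "gpu"
            used += g.size_gb
        else:
            placement[g.name] = "cpu"
    return placement


# Made-up groups and sizes, purely for illustration.
groups = [
    ParamGroup("lit_block", "literature", 6.0),
    ParamGroup("lit_extra", "literature", 4.0),
    ParamGroup("code_block", "code", 8.0),
    ParamGroup("math_block", "math", 5.0),
]

placement = plan_placement(groups, "literature", gpu_budget_gb=8.0)
print(placement)
```

Even in this toy version not every literature group fits in the budget, so something still has to decide which on-topic parts count as the important ones.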