Hacker News
by kramit1288 28 days ago
Accurate memory estimation is key here. It will crash if the estimate isn't accurate, and the estimate can't be generic across all local LLMs: each local LLM needs a different amount of memory for its context.
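
To illustrate why one generic number doesn't work, here is a rough sketch of the context (KV cache) part of the estimate. It assumes a standard transformer with an fp16 cache. The `ModelShape` class, the `kv_cache_bytes` helper, and both model shapes are made up for illustration and are not taken from any specific model or tool. It also ignores weights, activations, and runtime overhead, so a real estimate would be larger.

```python
from dataclasses import dataclass


@dataclass
class ModelShape:
    """Hypothetical description of the parts of a model that size the KV cache."""
    n_layers: int
    n_kv_heads: int
    head_dim: int


def kv_cache_bytes(shape: ModelShape, context_len: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV cache size in bytes for one sequence.

    The leading 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes an fp16 cache (an assumption, not a given).
    """
    return (
        2
        * shape.n_layers
        * shape.n_kv_heads
        * shape.head_dim
        * context_len
        * bytes_per_elem
    )


# Two made-up shapes: same depth and head size, different number of KV heads.
model_a = ModelShape(n_layers=32, n_kv_heads=32, head_dim=128)
model_b = ModelShape(n_layers=32, n_kv_heads=8, head_dim=128)

for name, shape in [("model_a", model_a), ("model_b", model_b)]:
    gib = kv_cache_bytes(shape, context_len=4096) / 2**30
    print(f"{name}: {gib:.2f} GiB of KV cache at 4096 tokens")
```

With these assumed shapes, the same 4096-token context costs 4x more memory on one model than on the other, so the estimate would have to be computed per model.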