by mhitza 166 days ago
Every open-weights model I tried (that fits in under 20GB of memory) loops easily.

I run models with llama.cpp, and that looping is the reason I add a repeat penalty factor.
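For anyone wondering what a repeat penalty does, here is a rough sketch of the common scheme: logits of recently generated tokens are scaled down before sampling. The function name `apply_repeat_penalty` and the 1.1 default are my own illustrative choices, not llama.cpp's code; in llama.cpp itself I believe this is exposed through the `--repeat-penalty` option.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1):
    """Return a copy of `logits` with recently seen token ids made less likely.

    Hypothetical helper for illustration only, not llama.cpp's implementation.
    Assumes every id in `recent_tokens` is a valid index into `logits`.
    """
    out = list(logits)
    for tok in set(recent_tokens):
        if out[tok] > 0:
            # Positive logit: dividing shrinks it toward zero.
            out[tok] /= penalty
        else:
            # Negative (or zero) logit: multiplying pushes it further down.
            out[tok] *= penalty
    return out


if __name__ == "__main__":
    logits = [2.0, -1.0, 0.5]
    # Tokens 0 and 1 were generated recently; token 2 was not.
    print(apply_repeat_penalty(logits, recent_tokens=[0, 1], penalty=2.0))
```

A penalty of 1.0 leaves the logits unchanged. Values slightly above 1.0 discourage loops, while large values tend to hurt output quality because they also penalize legitimately repeated words.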