by ParetoOptimal 884 days ago
In my experience, when a 6.7B model or similar is claimed to beat a 33B model, it's typically not really true. At the least, I have a very high burden of proof before believing it.