by 123yawaworht456 394 days ago

>In my local AI (mistral-nemo), around 10 thousand tokens of context decreases my token gen speed from 70 t/s to 20 t/s. And the LLM starts ignoring the context after a while.

as much as it pains me to say this, only cloud models are somewhat viable for this. AI-powered NPCs are my dream too, and after many attempts with countless local and cloud models, I've given up for now. local models are dumb and incurably sloppy; cloud models can be coaxed into producing somewhat decent prose, but they are prohibitively expensive.

Mistral models are particularly soulless and full of clichés.

https://eqbench.com/creative_writing_longform.html

https://eqbench.com/results/creative-writing-longform/mistra...