


	by dzink 53 days ago
	My push to try local models came from the wildly unpredictable, yet seemingly systematic, performance swings of large models like Opus and ChatGPT. At certain times of day or week they feel degraded beyond belief. I don't know if it is deliberate, a function of demand, or a function of the models themselves. We are all learning the shape of this space by trying. I need to be able to rely on consistent performance. Maybe that means putting a harness of benchmarks between models, maybe between different inference providers, and maybe it means going local.
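
	For what it's worth, here is a rough sketch of what such a benchmark harness might look like. Everything in it is an assumption for illustration: the `CASES` list, the `Run` record, the `run_once` and `degraded` helpers, and the two fake backends are hypothetical names, not any provider's API. Each backend is just a function that takes a prompt and returns text, so a hosted model, a different inference provider, or a local model could be plugged in the same way.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical fixed test set: (prompt, substring expected in the answer).
CASES: List[Tuple[str, str]] = [
    ("What is 2 + 2? Answer with a number.", "4"),
    ("What is the capital of France?", "Paris"),
]


@dataclass
class Run:
    backend: str
    timestamp: float
    score: float      # fraction of cases passed, 0.0 to 1.0
    latency_s: float  # wall time for the whole case set


def run_once(name: str, ask: Callable[[str], str]) -> Run:
    """Run every case against one backend and record score and latency."""
    start = time.perf_counter()
    passed = sum(1 for prompt, expected in CASES if expected in ask(prompt))
    elapsed = time.perf_counter() - start
    return Run(name, time.time(), passed / len(CASES), elapsed)


def degraded(history: List[Run], tolerance: float = 0.1) -> bool:
    """True if the latest score is more than `tolerance` below the best earlier score."""
    if len(history) < 2:
        return False
    best = max(r.score for r in history[:-1])
    return history[-1].score < best - tolerance


# Stand-in backends (assumptions); replace with real model calls.
def fake_local(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "Paris"


def fake_flaky(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "I am not sure."
```

	Running `run_once` for each backend on a schedule and keeping the `Run` records per backend would give a history to compare by time of day or week. Substring matching is a crude scorer; a real harness would likely need task-specific checks.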