| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by esperent 1202 days ago
	> people who inference their models at scale What does "inferencing models at scale" mean?

3 comments

cypress66 1202 days ago

Actually using the model, instead of doing inference a few times when writing your paper and then that's it.

link

andai 1202 days ago

Optimizing it to be cheaper to run (e.g. OpenAI saved a lot of money with the turbo variant that ChatGPT uses, relative to the original GPT-3 models).

link

Closi 1202 days ago

Running server farms to quickly give answers to users, like bing or ChatGPT.

The models produced have to be fast and efficient to support capacity/cost, which has some detriment to quality/accuracy.

link