| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by aeonfox 126 days ago
	I was thinking more if you write a prompt into an IDE that has first-party integration with an LLM platform (e.g. VS Code with Github Copilot), it would make sense on their end to reduce and remove redundant input before ingesting the token into their models, just to increase throughput (increase customers) and decrease latency (reduce costs). They would be foolish not to do this kind of optimisation, so surely they must be doing it. Whether they would pass on those token savings to the user, I couldn't say.