| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by zozbot234 4 days ago
	If input tokens dominate the cost to that extent, this implies that major gains are possible by making better use of caching. You could basically ask the model to do a one-time "compaction" step including a dump of the relevant portions of the code, and use that as the cached prefix for a large amount of "swarm" subagent calls.