


	by ACCount37 297 days ago
	Should be safe to do, as long as none of that is load-bearing. If it's the usual naive "massage the image into a hundred tokens and throw that into the context" vision implementation, nothing bad should happen from removing the vision weights or just freezing them. I've seen "cut off unused vision inputs" done for older multimodals, just not for the newer Gemma 3.
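
To make "cut off unused vision inputs" concrete, here is a minimal sketch of dropping vision weights from a checkpoint by key prefix. The prefixes `vision_tower.` and `multi_modal_projector.`, the helper `split_vision_weights`, and the toy checkpoint are all assumptions for illustration; real checkpoints name their vision modules differently, so inspect the keys first. It only applies if the text path never reads those weights.

```python
from typing import Any, Dict, Iterable, Tuple

# Hypothetical prefixes: check the actual key names in your checkpoint.
VISION_PREFIXES = ("vision_tower.", "multi_modal_projector.")


def split_vision_weights(
    state_dict: Dict[str, Any],
    prefixes: Iterable[str] = VISION_PREFIXES,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Partition a checkpoint into (text_only, removed) by key prefix."""
    prefix_tuple = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    text_only: Dict[str, Any] = {}
    removed: Dict[str, Any] = {}
    for name, tensor in state_dict.items():
        target = removed if name.startswith(prefix_tuple) else text_only
        target[name] = tensor
    return text_only, removed


# Toy checkpoint: strings stand in for real tensors.
toy_checkpoint = {
    "language_model.embed_tokens.weight": "T0",
    "language_model.layers.0.mlp.weight": "T1",
    "vision_tower.patch_embed.weight": "V0",
    "multi_modal_projector.linear.weight": "V1",
}

text_only, removed = split_vision_weights(toy_checkpoint)
print(sorted(text_only))
print(sorted(removed))
```

The freezing alternative keeps the weights in place and just stops training them; in PyTorch that usually means setting `requires_grad = False` on the vision parameters.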