| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by agnishom 165 days ago
	> LLMs tokenize English words efficiently. Symbols like {, }, === fragment into multiple tokens. Words like "plus", "minus", "if" are single tokens. The insight seems flawed. I think LLMs are just as capable of understanding these symbols as tokens as they are English words. I am not convinced that this is a better idea than writing code with a ton of comments