Hacker News
by Tiberium 4 days ago
> This isn’t because the model can’t count. It’s because it never sees the letters at all.

> The chunks aren’t characters and they aren’t words. They’re something more specific, and the specificity matters more than most people realize.

> Those numbers are real, but they hide what a token actually is.

> GPT-4’s vocabulary isn’t Claude’s. Claude’s isn’t Llama’s.

> The model never sees text. It sees a sequence of integer indices into its own private alphabet.

> So tokens aren’t “roughly like words” or “kind of like characters”. They’re the atoms of perception for one specific model, and they’re the only language that model speaks.

> The same sentence is nine tokens to GPT-4 and seven tokens to Llama 3. Not because Llama is smarter or the sentence changed, but because the two models have different vocabularies.

> That’s it. No clever scoring, no neural network.

Could people who use LLMs to write articles at least prompt them to have a better style? I'm really tired of the default Claude style (a lot of Chinese models also reuse the same style).

1 comment

I appreciate the feedback. My main focus was on the visual elements, not so much on "ridding the text of AI traces".

What did you think about the more visual elements?

Simon