by dalys 1150 days ago
1M tokens is around 750k words in English.

According to Wolfram Alpha that is:

Single-spaced document: 1500 pages

Double-spaced document: 3000 pages

Book: 1028 pages

So around 1-5 books.
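
For anyone who wants to redo the arithmetic, here is a rough sketch in Python. The 0.75 words-per-token ratio is just the rule of thumb behind the 750k figure, and the words-per-page values are back-derived from the page counts quoted above (750,000 / 1500 = 500, and so on), so treat them all as assumptions rather than standards. The function names are made up for illustration.

```python
import math

# Assumption: rough English average implied by "1M tokens ~ 750k words".
WORDS_PER_TOKEN = 0.75

# Assumption: words per page back-derived from the page counts quoted above.
WORDS_PER_PAGE = {
    "single-spaced": 500,  # 750,000 words / 1500 pages
    "double-spaced": 250,  # 750,000 words / 3000 pages
    "book": 730,           # roughly 750,000 words / 1028 pages
}


def tokens_to_words(tokens: int) -> int:
    """Estimate the English word count for a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def estimate_pages(tokens: int) -> dict[str, int]:
    """Estimate the page count per layout, rounding partial pages up."""
    words = tokens_to_words(tokens)
    return {
        layout: math.ceil(words / words_per_page)
        for layout, words_per_page in WORDS_PER_PAGE.items()
    }


if __name__ == "__main__":
    tokens = 1_000_000
    print(f"{tokens:,} tokens is about {tokens_to_words(tokens):,} words")
    for layout, pages in estimate_pages(tokens).items():
        print(f"{layout}: {pages} pages")
```

The real ratio depends on the tokenizer and the text, so this is only good for ballpark numbers.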

I'm assuming they're using OpenAI's tiktoken tokenizer, though I'm not sure.