by hospitalJail
1107 days ago
> I get using a proper tokenizer and just calling `strings.Split`, and it seems to be remarkably stable for a given model and language (multiply the length of the result of splitting on spaces by 1.55 for OpenAI and 1.7 for Claude, which leaves a tiny safety margin).

One time I suggested this and got downvoted to hell. To be fair to the downvoters, I quoted OpenAI's figure of 7 tokens per word (from their tutorial page). That seems incredibly unrealistic in hindsight, but at the time things were fresh. Also, I think most people wanted something more robust than a linear calculation.
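
For anyone who wants to try the linear estimate from the quoted comment, here is a minimal sketch in Go. The multipliers 1.55 and 1.7 are taken straight from the quote and are not verified here. The helper name `estimateTokens`, the constant names, and the choice to round up are my own assumptions for illustration.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Rules of thumb quoted in the comment above; not measured or verified here.
const (
	openAITokensPerWord = 1.55
	claudeTokensPerWord = 1.7
)

// estimateTokens is a hypothetical helper: it splits the text on single
// spaces and scales the number of pieces by tokensPerWord, rounding up.
func estimateTokens(text string, tokensPerWord float64) int {
	// strings.Split("", " ") returns one empty element, so handle "" explicitly.
	if text == "" {
		return 0
	}
	words := len(strings.Split(text, " "))
	return int(math.Ceil(float64(words) * tokensPerWord))
}

func main() {
	prompt := "the quick brown fox jumps over the lazy dog"
	fmt.Println("OpenAI estimate:", estimateTokens(prompt, openAITokensPerWord))
	fmt.Println("Claude estimate:", estimateTokens(prompt, claudeTokensPerWord))
}
```

Note that `strings.Split` on a single space counts every run of consecutive spaces as extra empty pieces, which inflates the estimate slightly. `strings.Fields` would ignore repeated whitespace, but that is a different count than the one the quote describes.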