by 2-718-281-828 848 days ago
but then it will either overfit or you need to train it on 20 times the amount of data ...

I'm talking about using an LLM, which doesn't involve training and thus can't involve overfitting.
for an LLM to exhibit a verbal relationship between counting and tokens, you have to train it on that. maybe you mean something like a plugin or extension, but that's something else and has nothing to do with LLMs specifically.