by danielmarkbruce
515 days ago
The LLM can't, that's what makes it relatively difficult. The tokenizer can. Run it through your head with character-level tokenization. Imagine the attention calculations. See how easy it would be? See how few samples would be required? It's a trivial thing when the tokenizer breaks everything down to characters. Consider the amount and specificity of training data required to learn spelling 'games' using current tokenization schemes: vocabularies of 100,000-plus tokens, many of which are close together in high-dimensional space but spelled very differently. Then consider the various datasets which give phonetic information as a method to spell. They'd be tokenized in ways which confuse a model. Look, maybe go build one. Your head will spin once you start dealing with the various types of training data and how different tokenization changes things. It screws up spelling, math, code, technical biology material, financial material. I specifically build models for financial markets and it's an issue.
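To make the contrast concrete, here's a rough sketch in Python. Everything in it is a toy assumption: `TOY_VOCAB` is a made-up four-entry vocabulary and `greedy_subword_tokenize` is a simple longest-match splitter, not the BPE procedure any real tokenizer uses. It only illustrates that character tokens expose the letters directly while subword tokens hide them inside opaque units.

```python
def char_tokenize(text):
    """Character-level tokenization: every letter is its own token."""
    return list(text)


def greedy_subword_tokenize(text, vocab):
    """Toy longest-match subword splitter (hypothetical, not real BPE).

    Falls back to a single character when no vocabulary entry matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here; emit one character.
            tokens.append(text[i])
            i += 1
    return tokens


# Made-up vocabulary for illustration only.
TOY_VOCAB = {"straw", "berry", "spell", "ing"}

word = "strawberry"
char_tokens = char_tokenize(word)
subword_tokens = greedy_subword_tokenize(word, TOY_VOCAB)

print(char_tokens)             # one token per letter
print(subword_tokens)          # ['straw', 'berry']
print(char_tokens.count("r"))  # 3: letters can be counted straight off the tokens
```

With the character tokens, counting the letter "r" is a lookup over the sequence. With the subword tokens, the model only sees two IDs, so the spelling of each one has to be learned from training data.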
|
Well, as you can verify for yourself, LLMs can spell just fine, even if you choose to believe that they are doing so by black magic or tool use rather than learnt prediction.
So, whatever problems you are having with your financial models aren't because they can't spell.