by samus 822 days ago
Character-level operations are difficult for LLMs. Because of tokenization, they don't really "perceive" a string as a list of characters; what they receive is a sequence of token IDs, each of which can cover several characters. There are LLMs that ingest raw bytes, but they are intended for processing binary data.
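To make the point concrete, here is a rough sketch of what tokenization does to a word. The vocabulary and the greedy longest-match rule are made up for illustration; real tokenizers (BPE and similar) learn far larger vocabularies and split text by different rules, so treat this as a toy, not as how any particular model works.

```python
# Hypothetical toy vocabulary: two multi-character tokens plus a few single letters.
VOCAB = {"straw": 0, "berry": 1, "s": 2, "t": 3, "r": 4,
         "a": 5, "w": 6, "b": 7, "e": 8, "y": 9}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenization (an assumed, simplified scheme).

    Returns the list of token IDs that stands in for `text`.
    """
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry covers the character at position i.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


word = "strawberry"
print(list(word))                  # the characters a normal program iterates over
print(tokenize(word))              # the token IDs a tokenized model receives
print(list(word.encode("utf-8")))  # the byte values a byte-level model would ingest
```

With this toy vocabulary the ten letters of "strawberry" collapse into two IDs, so a question like "how many r's are in this word?" is about information the model never sees directly in its input.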