by xfalcox 302 days ago
Having a public tokenizer is quite useful, especially for embeddings. It lets you do the chunking locally, without a network round trip to the provider just to count tokens.
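To make the local chunking concrete, here is a rough sketch of splitting text into overlapping token windows before sending anything to an embedding API. The `chunk_by_tokens` helper and the whitespace `toy_encode`/`toy_decode` pair are hypothetical stand-ins I made up for illustration; in practice you would pass in the `encode`/`decode` functions of whatever public tokenizer matches your embedding model.

```python
from typing import Any, Callable, List, Sequence


def chunk_by_tokens(
    text: str,
    encode: Callable[[str], Sequence[Any]],
    decode: Callable[[Sequence[Any]], str],
    max_tokens: int = 8,
    overlap: int = 2,
) -> List[str]:
    """Split text into chunks of at most max_tokens tokens, with overlap."""
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    tokens = encode(text)          # runs locally, no network call
    step = max_tokens - overlap    # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(decode(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break                  # this window already reached the end
    return chunks


# Toy whitespace tokenizer (an assumption for the demo, not a real model's
# tokenizer); swap in the real encode/decode for your embedding model.
def toy_encode(text: str) -> List[str]:
    return text.split()


def toy_decode(tokens: Sequence[str]) -> str:
    return " ".join(tokens)


if __name__ == "__main__":
    for chunk in chunk_by_tokens("a b c d e f g h i j", toy_encode, toy_decode,
                                 max_tokens=4, overlap=1):
        print(chunk)
```

The point is only that the token counting and splitting happen on your machine; the window size and overlap here are arbitrary and should be set from the model's actual input limit.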

and you don't have to re-embed everything if the provider sunsets a model.