Hacker News
by olaird25 17 days ago
Have you tried just using a faster tokenization library?

GitHub's bpe crate (https://crates.io/crates/bpe) advertises a >10x speedup over Hugging Face's tokenizers, and other libraries make similar claims.