Hacker News
Show HN: FlashTokenizer – 10x faster C++ tokenizer for Python (github.com)
5 points by springkim 437 days ago
I built a tokenizer in C++ with Python bindings. On large inputs it runs about 10x faster than Hugging Face tokenizers, and it's optimized for low memory usage and low latency.
Benchmarks and a comparison are included in the README. I'd love feedback or contributions!
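
For anyone who wants to sanity-check a speedup figure like this on their own data, here is a minimal timing sketch. It is not the benchmark code from the README, and it does not use FlashTokenizer's actual API: `time_tokenizer`, `speedup`, and the two toy tokenizers are hypothetical names for illustration only. Swap in the real tokenizers' encode callables to compare them.

```python
import time
from typing import Callable, Iterable, List


def time_tokenizer(tokenize: Callable[[str], List[str]],
                   texts: Iterable[str],
                   repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` full passes."""
    batch = list(texts)  # materialize once so every pass sees the same inputs
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in batch:
            tokenize(text)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Hypothetical stand-ins, NOT real tokenizer APIs: replace these with the
# encode functions of the tokenizers you actually want to compare.
def whitespace_tokenize(text: str) -> List[str]:
    return text.split()


def char_tokenize(text: str) -> List[str]:
    return list(text)


if __name__ == "__main__":
    corpus = ["the quick brown fox jumps over the lazy dog " * 200] * 50
    baseline = time_tokenizer(char_tokenize, corpus)
    candidate = time_tokenizer(whitespace_tokenize, corpus)
    print(f"baseline:  {baseline:.4f} s")
    print(f"candidate: {candidate:.4f} s")
    print(f"speedup:   {speedup(baseline, candidate):.1f}x")
```

Taking the best of several passes reduces noise from other processes. Results will still depend heavily on input length, so it's worth timing both short and long texts.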