Show HN: FlashTokenizer – 10x faster C++ tokenizer for Python
5 points by springkim 1 day ago | 0 comments
I built a tokenizer in C++ with Python bindings that runs about 10x faster than HuggingFace tokenizers on large inputs. It's optimized for low memory usage and low latency.

Benchmarks and a comparison are included in the README. Would love feedback or contributions!
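
If you want to sanity-check a speedup figure like this on your own data, a minimal timing sketch might look like the one below. Everything in it is an assumption for illustration: `baseline_tokenize` and `candidate_tokenize` are hypothetical stand-ins (plain whitespace splitters), not the FlashTokenizer or HuggingFace APIs, so swap in the real calls you want to compare.

```python
import re
import time
from typing import Callable, List, Sequence

Tokenizer = Callable[[str], List[str]]


def baseline_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for the reference tokenizer (regex-based)."""
    return re.findall(r"\S+", text)


def candidate_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for the tokenizer under test."""
    return text.split()


def best_time(fn: Tokenizer, texts: Sequence[str], repeats: int = 5) -> float:
    """Return the fastest wall-clock time (seconds) over `repeats` full passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            fn(text)
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline: Tokenizer, candidate: Tokenizer,
            texts: Sequence[str], repeats: int = 5) -> float:
    """Ratio of baseline time to candidate time; values above 1 favor the candidate."""
    return best_time(baseline, texts, repeats) / best_time(candidate, texts, repeats)


def main() -> None:
    # A synthetic "large input" corpus; replace with your own documents.
    corpus = ["the quick brown fox jumps over the lazy dog " * 1000] * 20

    # Only compare speed once both tokenizers agree on the output.
    assert all(baseline_tokenize(t) == candidate_tokenize(t) for t in corpus)

    print(f"speedup: {speedup(baseline_tokenize, candidate_tokenize, corpus):.2f}x")


if __name__ == "__main__":
    main()
```

Taking the best of several passes reduces noise from other processes, and checking that both tokenizers produce identical output first keeps the comparison fair. The ratio you get will depend on input size and hardware.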
