by arktiso 1001 days ago
Wow, the latency on requests feels great! I’m really curious: is this running entirely in Python?
1 comment

100% Python, but with a good deal of multiprocessing, speculative decoding, etc. As we move to production we can probably shave another 100 ms off by moving over to a compiled system, but Python is great for rapid iteration.
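
For anyone unfamiliar with speculative decoding, here is a rough sketch of the greedy variant in plain Python. This is not the code behind the service discussed here; `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are toy functions that return the next token for a context. The idea: a cheap draft model guesses a few tokens ahead, the expensive target model checks them, and the longest matching prefix is kept.

```python
from typing import Callable, List

Token = int
# Assumption: a "model" is any function mapping a context to its greedy next token.
Model = Callable[[List[Token]], Token]


def speculative_decode(target: Model, draft: Model, prompt: List[Token],
                       max_new: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding; output matches greedy decoding with `target` alone."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The draft model cheaply proposes up to k tokens.
        proposal: List[Token] = []
        ctx = list(out)
        for _ in range(min(k, goal - len(out))):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)

        # 2. The target model verifies each proposed position.
        #    (A real system would score all positions in one batched pass.)
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # Keep the agreed prefix, then take the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every proposed token was accepted.
            out.extend(proposal)
    return out


def toy_target(ctx: List[Token]) -> Token:
    # Toy "large" model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    # Toy "small" model: agrees with the target except right after token 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10
```

Calling `speculative_decode(toy_target, toy_draft, [0], max_new=8)` yields the same tokens the target would produce on its own. The latency win in practice comes from the target checking several draft tokens in a single pass instead of generating them one at a time.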