by spxneo 812 days ago
So how would you go about improving that?

Not using an LLM for it.
We only send 0.5-5% of traffic to GPT-4, thanks to smaller, faster, cheaper models. So not all of our traffic is hit with 50s latencies :-/
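
The comment doesn't say how traffic gets split, so this is only a guess at the general shape: a cascade where a small model answers first and the request is escalated to the big model only when the small one isn't confident. Everything here is hypothetical (`CascadeRouter`, the stub models, the 0.8 threshold, the idea that the small model returns a confidence score at all); it's a rough sketch, not the setup described above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not a real API):
# the small model returns (answer, confidence in [0, 1]); the large model returns an answer.
SmallModel = Callable[[str], Tuple[str, float]]
LargeModel = Callable[[str], str]


@dataclass
class CascadeRouter:
    small: SmallModel
    large: LargeModel
    threshold: float = 0.8  # assumed cutoff; would need tuning on real traffic
    total: int = 0
    escalated: int = 0

    def answer(self, prompt: str) -> str:
        """Try the small model first; fall back to the large one on low confidence."""
        self.total += 1
        text, confidence = self.small(prompt)
        if confidence >= self.threshold:
            return text
        self.escalated += 1
        return self.large(prompt)

    @property
    def escalation_rate(self) -> float:
        """Fraction of requests that reached the large (slow, expensive) model."""
        return self.escalated / self.total if self.total else 0.0


# Stub models for demonstration only; real ones would call actual inference endpoints.
def stub_small(prompt: str) -> Tuple[str, float]:
    # Pretend that short prompts are easy and long ones are hard.
    confidence = 0.95 if len(prompt) < 40 else 0.3
    return f"small:{prompt}", confidence


def stub_large(prompt: str) -> str:
    return f"large:{prompt}"


router = CascadeRouter(small=stub_small, large=stub_large)
```

Tracking `escalation_rate` is the useful part: it's the number you'd watch to see whether you're actually in that 0.5-5% range, and raising or lowering the threshold trades cost and latency against answer quality.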