Hacker News new | ask | show | jobs
by frenchmajesty 235 days ago
OP here. I agree with you. For production use we use VoyageAI which is usually 2x faster than OpenAI at similar quality levels (p90 is < 200ms) but we're looking at spinning up a local embedding in our cloud environment, that would make p95 < 100ms and make cost negligible as well.