|
|
|
|
|
by mips_avatar
155 days ago
|
|
Reply to your comment below this (since hn limits comment depth to 4). The 40ms latency is an average but 90% of queries are getting routed to a single index, latency is worse when the routing goes to 3. Since I already batch the embedding generation I should be able to get hard queries down to like 50ms. |
|