Hacker News
by generall, 1042 days ago
With a few optimization tricks. TL;DR:
- ONNX inference in Rust
- Embeddings cache & lookup
- Parallel & batch requests
- Hybrid search with full-text filtering + vector re-scoring
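To make the last point concrete, here is a minimal sketch of the two-stage idea: a cheap full-text filter narrows the candidates, then only the survivors are re-scored by vector similarity. This is not the implementation from the post. `Doc`, `full_text_filter`, and `hybrid_search` are hypothetical names, the filter is a naive substring match standing in for a real inverted index, and the embeddings are assumed to be precomputed elsewhere (for example by an ONNX model).

```rust
/// Hypothetical indexed document: raw text plus a precomputed embedding.
struct Doc {
    id: u32,
    text: String,
    embedding: Vec<f32>,
}

/// Cosine similarity between two vectors; returns 0.0 if either has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Stage 1: full-text filter. Keeps documents whose text contains every
/// query term (case-insensitive substring match; an assumption made here
/// in place of a real full-text index).
fn full_text_filter<'a>(docs: &'a [Doc], query: &str) -> Vec<&'a Doc> {
    let terms: Vec<String> = query
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect();
    docs.iter()
        .filter(|d| {
            let text = d.text.to_lowercase();
            terms.iter().all(|t| text.contains(t.as_str()))
        })
        .collect()
}

/// Stage 2: re-score the filtered candidates by cosine similarity to the
/// query embedding and return the `top_k` best as (id, score) pairs.
fn hybrid_search(
    docs: &[Doc],
    query: &str,
    query_embedding: &[f32],
    top_k: usize,
) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = full_text_filter(docs, query)
        .into_iter()
        .map(|d| (d.id, cosine(&d.embedding, query_embedding)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);
    scored
}
```

Calling `hybrid_search(&docs, "rust inference", &query_embedding, 10)` only computes similarities for documents that mention both terms, so a document with a near-identical embedding but no text match is never scored. That trade-off is the point of filtering first: the expensive vector comparison runs on a small candidate set.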