by groby_b
162 days ago
But the bottleneck is generating embeddings either way. memchunk has a throughput of 164 GB/s. A really fast embedder can deliver maybe 16k embeddings/sec, or ~1.6 MB/s (if you assume 100-char sentences at one byte per char). That's about five orders of magnitude difference. Chunking is not the bottleneck. It might matter architecturally (say you stuff chunks into a MQ and want full visibility into queue size ASAP), but otherwise it doesn't matter how fast you chunk; your embedder will slow you down. It's still a neat exercise on principle, though :)
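For anyone who wants to redo the back-of-the-envelope math, here's a rough sketch. It takes the 164 GB/s and 16k embeddings/sec figures at face value and assumes one byte per character; the constant and function names are made up for illustration and aren't part of memchunk or any embedding library.

```python
import math

# Figures quoted in the comment, taken at face value.
CHUNKER_BYTES_PER_SEC = 164e9   # memchunk throughput: 164 GB/s
EMBEDDINGS_PER_SEC = 16_000     # "really fast embedder" estimate
BYTES_PER_CHUNK = 100           # assumption: 100-char sentences, 1 byte per char


def embedder_bytes_per_sec(embeddings_per_sec: int, bytes_per_chunk: int) -> int:
    """Text throughput the embedder can absorb, in bytes per second."""
    return embeddings_per_sec * bytes_per_chunk


def orders_of_magnitude(fast: float, slow: float) -> float:
    """Base-10 log of the ratio between two throughputs."""
    return math.log10(fast / slow)


embed_bps = embedder_bytes_per_sec(EMBEDDINGS_PER_SEC, BYTES_PER_CHUNK)
gap = orders_of_magnitude(CHUNKER_BYTES_PER_SEC, embed_bps)

print(f"embedder: {embed_bps / 1e6:.1f} MB/s")
print(f"chunker:  {CHUNKER_BYTES_PER_SEC / 1e9:.0f} GB/s")
print(f"gap:      ~{gap:.1f} orders of magnitude")
```

Swap in your own embedder rate and average chunk size; unless the chunks get very large, the chunker stays far ahead of the embedder.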