| HN Mirror



	by imperio59 204 days ago
	From the author:
	> at some point we started benchmarking on wikipedia-scale datasets.
	> that’s when things started feeling… slow.
	So they're talking about this becoming an issue when chunking TBs of data (I assume), not your 1kb random string...

1 comments

groby_b 204 days ago

But the bottleneck is generating embeddings either way.

memchunk has a throughput of 164 GB/s. A really fast embedder can deliver maybe 16k embeddings/sec, or ~1.6 MB/s (if you assume 100-char sentences at one byte per char).

That's about five orders of magnitude difference. Chunking is not the bottleneck.
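
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes one byte per character, which is only roughly true for mostly-ASCII text, and the constant and function names are made up for illustration, not taken from memchunk or any embedder.

```python
import math

# Figures quoted in the comment; treat them as ballpark numbers.
CHUNKER_BYTES_PER_SEC = 164e9   # quoted memchunk throughput
EMBEDDINGS_PER_SEC = 16_000     # quoted "really fast embedder" rate
CHARS_PER_CHUNK = 100           # assumed sentence length
BYTES_PER_CHAR = 1              # assumption: mostly-ASCII text


def embedder_bytes_per_sec(embeddings_per_sec, chars_per_chunk, bytes_per_char=1):
    """Convert an embeddings/sec rate into an equivalent bytes/sec rate."""
    return embeddings_per_sec * chars_per_chunk * bytes_per_char


def orders_of_magnitude(fast, slow):
    """Base-10 log of the ratio between two rates."""
    return math.log10(fast / slow)


embed_rate = embedder_bytes_per_sec(EMBEDDINGS_PER_SEC, CHARS_PER_CHUNK, BYTES_PER_CHAR)
gap = orders_of_magnitude(CHUNKER_BYTES_PER_SEC, embed_rate)

print(f"embedder: {embed_rate / 1e6:.1f} MB/s")
print(f"gap: ~{gap:.1f} orders of magnitude")
```

Plug in your own embedder rate and chunk size; the conclusion only changes if the two rates end up within the same order of magnitude.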

It might be an architectural issue (you stuff chunks into an MQ and want full visibility into queue size ASAP), but otherwise it doesn't matter how fast you chunk: your embedder will slow you down.

It's still a neat exercise on principle, though :)


viraptor 204 days ago

It doesn't matter if A takes much more time than B, if B is large enough. You're still saving resources and time by optimising B. Also, you seem to assume that every chunk will get embedded - they may be revisiting some pages where the chunks are already present in the database.


groby_b 203 days ago

Amdahl's law still holds, though. If A and B differ in execution times by orders of magnitude, optimising B yields minimal returns (assuming streaming, vs fully serial processing)
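
A minimal sketch of both cases, with purely illustrative numbers (stage A, embedding, taking 100x as long as stage B, chunking). The function names are hypothetical. The serial case is the standard Amdahl formula; the streaming case assumes the stages fully overlap, so the slowest stage sets the pace.

```python
def serial_speedup(t_chunk, t_embed, chunk_speedup):
    """Amdahl's law: overall speedup when only the chunking stage gets faster."""
    p = t_chunk / (t_chunk + t_embed)  # fraction of total time spent chunking
    return 1.0 / ((1.0 - p) + p / chunk_speedup)


def streaming_throughput(chunk_rate, embed_rate):
    """With fully overlapped stages, throughput is set by the slowest stage."""
    return min(chunk_rate, embed_rate)


# Illustrative only: embedding takes 100x as long as chunking.
print(f"10x faster chunking, serial:  {serial_speedup(1.0, 100.0, 10.0):.4f}x overall")
print(f"infinitely fast chunking cap: {serial_speedup(1.0, 100.0, 1e12):.4f}x overall")

# Streaming: a 10x faster chunker leaves pipeline throughput unchanged.
before = streaming_throughput(chunk_rate=100.0, embed_rate=1.0)
after = streaming_throughput(chunk_rate=1000.0, embed_rate=1.0)
print(f"streaming throughput before/after: {before} / {after}")
```

In the serial case the overall gain is capped at about 1%, and in the streaming case it is zero until chunking becomes the slower stage.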

And sure, you can reject chunks, but a) the rejection isn't free, and b) you're still bound by embedding speed.

As for resource savings... not in the Wikipedia data range. If you scale up massively and go to a PB of data, going from kiru to memchunk saves you ~25 CPU days. But you also suddenly need to move from bog-standard high-CPU machines to machines supporting 164 GB/s memory throughput, likely bare metal with 8 memory channels. I'm too lazy to do the math, but it's going to be a mild difference at O($100).

Again, I'm not arguing this isn't a cool achievement. But it's very much engineering fun, not "crucial optimization".
