by nael_ob 638 days ago
Yes, they built a cool product! We're focusing on companies that feed data to their LLMs: we provide embeddings and chunking out of the box on top of all the data we sync. We don't only help you connect to third parties; you also receive data that can be interpreted for AI use cases (e.g., RAG).

The optimal chunking strategy is often highly dependent on the data used and the questions to be answered.

The net is plastered with blog posts about optimal strategies, of which there seem to be more than 10, with new approaches popping up often.

The consensus seems to be that trial and error is the way to go to optimize cost and performance.

How do you plan to tackle this when providing it out of the box?

That's why we wanted to try the OSS approach, where contributors can help keep up with the optimal strategy. We also plan to build an engine to test each strategy and compare retrieval performance before choosing one at runtime.
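
To make the "test each strategy and compare retrieval perf" idea concrete, here is a rough sketch of what such a comparison could look like. It is not the actual engine: every name in it (fixed_size_chunks, sentence_chunks, hit_rate, pick_strategy) is hypothetical, and it swaps real embeddings for a plain word-overlap score so that it runs with the standard library only. The shape is what matters: chunk the same corpus with each strategy, retrieve the top chunk for a small set of labeled questions, and keep the strategy with the best hit rate.

```python
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Chunker = Callable[[str], List[str]]


def fixed_size_chunks(text: str, size: int = 8) -> List[str]:
    """Hypothetical strategy 1: split into windows of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def sentence_chunks(text: str) -> List[str]:
    """Hypothetical strategy 2: one chunk per sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text: str) -> Counter:
    """Lowercased word counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, chunk: str) -> int:
    """Number of shared word occurrences (assumed scoring, not cosine similarity)."""
    return sum((tokens(query) & tokens(chunk)).values())


def hit_rate(chunker: Chunker, corpus: str, eval_set: List[Tuple[str, str]]) -> float:
    """Fraction of questions whose top-ranked chunk contains the expected answer."""
    chunks = chunker(corpus)
    hits = 0
    for question, expected in eval_set:
        best = max(chunks, key=lambda ch: overlap_score(question, ch))
        hits += expected.lower() in best.lower()
    return hits / len(eval_set)


def pick_strategy(
    strategies: Dict[str, Chunker], corpus: str, eval_set: List[Tuple[str, str]]
) -> Tuple[str, Dict[str, float]]:
    """Score every strategy on the same data and return the best one with all scores."""
    scores = {name: hit_rate(fn, corpus, eval_set) for name, fn in strategies.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    corpus = "Invoices sync every hour. Contacts sync once per day. Webhooks retry three times."
    eval_set = [
        ("How often do contacts sync?", "once per day"),
        ("How many times do webhooks retry?", "three times"),
    ]
    strategies = {"fixed_8": fixed_size_chunks, "sentence": sentence_chunks}
    print(pick_strategy(strategies, corpus, eval_set))
```

In a real setup the overlap score would be replaced by embedding similarity, and the labeled question set would have to come from the user's own data, which is where the "depends on the data and the questions" point above bites.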