by lmeyerov
425 days ago
Yeah, exactly. We still want chunking in practice to avoid LLM confusion and undifferentiated embeddings, and to handle large datasets and high volumes at lower cost. Large context means we can now tolerate multi-paragraph or multi-page chunks, so it's more like chunking by coherent section. In theory we can do an entire chapter or book, but those other concerns come in, so I only see more niche tools or talk-to-your-PDF apps do that. At the same time, embedding is often a significant cost in the above scenarios, so I'm curious about the semantic chunking overheads.
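
To make "chunk by coherent section" concrete, here is a rough sketch, not anyone's production pipeline. It assumes markdown-style `#` headings mark section boundaries and uses a character budget as a stand-in for a token limit. The name `chunk_by_section` and the `max_chars` parameter are made up for illustration.

```python
import re


def chunk_by_section(text, max_chars=4000):
    """Split text into section-level chunks, capped at roughly max_chars each.

    Assumption: lines starting with 1-6 '#' characters plus a space are section
    headings. A single paragraph longer than max_chars is emitted as-is.
    """
    # Zero-width split just before each heading line, so headings stay
    # attached to the section they introduce.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for section in sections:
        section = section.strip()
        if not section:
            continue
        if len(section) <= max_chars:
            # The whole section fits: keep it as one coherent chunk.
            chunks.append(section)
            continue
        # Oversized section: pack whole paragraphs until the budget is hit.
        current = ""
        for para in re.split(r"\n\s*\n", section):
            para = para.strip()
            if not para:
                continue
            candidate = f"{current}\n\n{para}" if current else para
            if current and len(candidate) > max_chars:
                chunks.append(current)
                current = para
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

This only needs a regex pass, so the chunking itself is cheap. Semantic chunking typically adds extra embedding or model calls on top to find the boundaries, which is where the overhead question comes from.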