Hacker News new | ask | show | jobs
by Art9681 846 days ago
Using content appropriate delimiters to create the chunks. If its a Python document, split it at the proper delimiters. If its an HTML document, JSON, etc. We wouldn't want to split a chunk in the middle of a function or paragraph. Or maybe we would. But that is what OP is refering to I believe.
1 comments

That's a good point, and documents too needs different chunking techniques. You don't want to split a word file the same way you split an excel...