by jcutrell 1153 days ago
This is actually pretty insightful. I have done something similar, splitting my Obsidian data into chunks using paragraphs and headers as demarcation, but this solves a more interesting problem of nuance! I like it.
2 comments

If you're interested in improved chunking, I mentioned a few strategies that I used when building https://findsight.ai in my talk here (timestamp linked, <1min): https://youtu.be/elNrRU12xRc?t=536
If you're already splitting documents by paragraph, consider using (as much as possible of) the previous and next paragraphs as overlap.
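
A rough sketch of what that overlap could look like, in case it helps. This is not the code from the talk or from findsight.ai. The names `split_paragraphs`, `chunks_with_overlap`, and `max_chars` are made up for illustration, and it budgets by characters where a real pipeline would probably count tokens and cut on sentence boundaries.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def chunks_with_overlap(paragraphs: list[str], max_chars: int = 1000) -> list[str]:
    """Build one chunk per paragraph, padded with neighbouring text.

    Each chunk holds the paragraph itself plus as much of the end of the
    previous paragraph and the start of the next one as fits in max_chars.
    The budget counts paragraph characters only, not the blank-line
    separators. A paragraph longer than max_chars is kept whole, with no overlap.
    """
    chunks = []
    for i, para in enumerate(paragraphs):
        prev_para = paragraphs[i - 1] if i > 0 else ""
        next_para = paragraphs[i + 1] if i + 1 < len(paragraphs) else ""

        budget = max(0, max_chars - len(para))
        # Start from an even split, then hand any unused share to the other side.
        prev_take = min(len(prev_para), budget // 2)
        next_take = min(len(next_para), budget - prev_take)
        prev_take = min(len(prev_para), budget - next_take)

        before = prev_para[len(prev_para) - prev_take:]  # tail of previous paragraph
        after = next_para[:next_take]                    # head of next paragraph
        chunks.append("\n\n".join(part for part in (before, para, after) if part))
    return chunks


if __name__ == "__main__":
    text = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for chunk in chunks_with_overlap(split_paragraphs(text), max_chars=30):
        print(repr(chunk))
```

Each paragraph still gets its own embedding, but the embedded text carries some of the surrounding context, which is the point of the overlap.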